PREFACE


Inspiration begins with imagination and the spirit to create. Then comes the need
to communicate, to share an idea or thought. Grab a pencil and you can make it
real: a picture, abstraction made concrete, ideas preserved in time. Our hearts and
minds are moved to tell stories, to teach what we think and feel to others and learn 
the same from them. 

Of all the visual media, computer graphics is one of the newest. The computer is 
a powerful amplifier—it can take terse descriptions of the world and create pictures 
of that world, using any rules you choose. If we choose the classical rules of light, 
then we can make pictures that can pass for photographs; other rules explore other 
ways of seeing. 

The field of image synthesis, also called rendering, is a field of transformation: it
turns the rules of geometry and physics into pictures that mean something to people. 
To accomplish this feat, the person who writes the programs needs to understand 
and weave together a rich variety of knowledge from math, physics, art, psychology, 
physiology, and computer science. Thrown together, these disciplines seem hardly 
related. Arranged and orchestrated by the creator of image synthesis programs, 
they become part of a cohesive, dynamic whole. Like cooperative members of any 
complex group, these fields interact in our minds in rich and stimulating ways. 

I find each of these disciplines inherently interesting; together they are fascinating.
Understanding the interplay of such diversity and exploring the connections is
exciting, and with the understanding of such elegant ideas comes a deep satisfaction. 
That’s why I love computer graphics: it’s stimulating to the intellect and rewarding 
to the heart. 

I couldn’t find a book that presented image synthesis as a complete and integrated 
field of study, encompassing all of the topics I just mentioned. But I love to write.
And so this book was born. 

The big idea in this book is to lay out the rules that tell a computer how to take 3D 
shapes and lights and create a picture—one that would pass for a photograph of that 
scene if it existed. So our driving problem is the simulation of Nature’s illumination 
of a scene, the capturing of that illumination on film, and its presentation to an 
observer. Sometimes we bypass the film idea and just imagine an observer in the 
scene. We often make it easy and pretend the observer has only one eye, so we can 
ask, "Given this scene, what picture do I show to the observer to make her think that 
she’s viewing the real scene?" We use all the disciplines I listed earlier to answer this 
question, since our goal is not merely to create an image, but to create a perceptual 
response in the viewer. 

It’s all a trick! Like any visual medium, computer graphics creates illusions. Fred 
Brooks [65] has observed that our job as image synthesists is to create an illusion of 
reality—to make a picture that carries our message, not necessarily one that matches 
some objective standard. It’s a creative job. 

This book is not about how to write specific programs, or how to implement 
particular algorithms. The history of computer graphics is like any discipline of 
thought: tried-and-true ideas are constantly challenged by new ideas, and sometimes 
the older ones, once seemingly invulnerable, are found somehow deficient and fade 
away. So it is with rendering algorithms; our marketplace of ideas is a noisy and 
bustling place right now. 

But there are some ideas that I believe are fundamental, that come from the basis 
of our discipline and lie at the heart of all we do. Those are the ideas in this book. I 
have included many examples from current practice, but I rarely go into their details. 
There are lots of references, and you can find a wealth of implementation information 
in the literature. My purpose here is to discuss the underlying principles—the ideas 
that have slowly emerged as the core of our discipline. 

There are three such basic fields: human vision, signal processing, and physics. 
These are not independent disciplines; as I’ve said, much of the fun of image synthesis 
is seeing how these fields fit together. But here I have chosen to give each of these 
topics its own day on the stage, in the form of a unit of the book. The fourth unit 
pulls the first three topics together and shows how they combine to make rendering 
algorithms. I look at two of today’s most popular techniques, hierarchical radiosity 
and distribution ray tracing, as examples to illustrate the principles. Finally, the fifth 
unit contains several appendices with short topic summaries, historical notes, and 
reference data. 

I make a general argument in this book. To design and implement a computer 
system for creating synthetic digital images for people to view, you need to understand 
the physics of the world you are simulating, the appropriate methods for simulating 
those physics in the computer, and the nature of the human visual system that 
ultimately interprets the image. 

The following few paragraphs describe the structure of the book and show how
the discussion has been arranged to provide an accumulating body of mathematical, 
physical, and physiological information that culminates in a modern image rendering 
system. There’s too much information here for a one-semester course on image 
synthesis. Teachers may choose to present in detail only some of the information in 
this book, covering the rest at a higher level; deciding where to dig deeply and where 
to summarize lightly will depend on the instructor, the course, and the students. The 
only material that ought not be skipped is the section on notation in Chapter 4. 
With suitable summaries from the instructor to cover the gaps, students can work 
sequentially, skipping material as desired. Since the book is cumulative, I don’t 
recommend hopping back and forth. 

In Volume 1, Unit I covers the human visual system, the effects of displays on 
images, and the representation of color. The idiosyncrasies of the human visual 
system are endless; it’s a finely tuned physical and neurological system of great 
complexity, which we are only beginning to understand in a quantified way. But 
there are some large-scale features that we do understand and that are important 
to computer graphics: those are the topics I stress in Chapter 1. I discuss some of 
the ways of representing color in Chapter 2, so that you can write programs that 
manipulate color information correctly. In addition, Chapter 3 considers the effect 
of a display on an image, since the transformation of a mathematical ideal into a 
physical reality inevitably includes a change in the message. 

Unit II addresses digital signal processing. In a digital computer, we transform
the smooth signals of everyday life into digitized, or sampled, representations. For 
example, we usually compute the color of an image only at a finite number of 
points on the display (the pixels), rather than at every infinitely small point on the 
image. This simple operation has profound repercussions, which often clash with an 
intuition born of our experience in the physical world. To ignore these effects is to 
invite a flood of visual and numerical problems, from "jaggies" or stairsteps in an 
image to an incorrect simulation with splotchy illumination and other ugly artifacts. 
To understand these issues, Chapter 4 discusses the nature of digital signals, and 
then Chapter 5 introduces the Fourier transform, which is a mathematical tool that 
reveals some of the internal structure of a signal. Like listening to an orchestral 
symphony and then looking at the complete score, taking the Fourier transform of 
a signal lets us isolate different components of the signal for closer study. A related 
tool is the wavelet transform, which is presented in Chapter 6. With these tools we 
can find ways to efficiently and accurately compute the integrals of functions. This is 
an essential part of image synthesis; in fact, much of image synthesis can be seen as 
nothing but numerical integration of various types. Chapter 7 covers the basic ideas 
of Monte Carlo integration, which is a powerful tool for handling this complex type 
of problem. 

With these analytic and comparative tools available to guide the discussion, I 
turn to more practical issues involved in rendering images. Chapter 8 discusses 
uniform sampling, which is the process of taking a continuous signal and turning it
into a digital representation by taking evenly spaced measurements. This process, 
though conceptually simple, introduces a Pandora’s box of unexpected problems. An 
alternative is nonuniform sampling, addressed in Chapter 9, which offers a different 
blend of advantages and disadvantages. Unit II ends with Chapter 10’s survey of the 
signal-processing methods that have proven of most use in image synthesis in recent 
years. 

Unit III, which opens Volume 2, turns to the physics of the real world. We begin 
with a study of the nature of light in Chapter 11, and then move on in Chapter 
12 to quantify the movement of energy through the world using the tools of energy 
transport. Chapter 13 presents the field of radiometry, which offers us terms and 
units for discussing the quantities and qualities of light present in different parts of a 
scene. Chapter 14 covers the physics of materials, so we have some understanding of 
how they interact with the light striking them. This leads us to Chapter 15’s study of 
the large-scale simulation of light-matter interaction, known in computer graphics 
as shading. The equations that describe how the shading on one object affects the 
shading on another involve integrals, so we look at the mathematical methods for 
manipulating and solving such integral equations in Chapter 16. By Chapter 17, 
we’ve learned enough to gather these ideas into a single equation known as the 
radiance equation, which gives the basic structure for how light moves through an 
environment. This is the single most important equation in image synthesis, and 
every digital image based on geometrical optics is always an approximate solution 
of it. 

The presentation of the radiance equation crowns the theoretical development 
covered in this book. Rendering practice is largely involved with finding ways to 
accurately and efficiently solve this equation. Because a complete analytic solution 
appears impossible in any but the most trivial environments, we must cut corners, 
simplify, and otherwise approximate everything involved in image-making, from the 
geometry of the scene to the physics of the simulation. The methods of digital signal 
processing give us the tools to understand which approximations are reasonable and 
what their effects will be, so we can choose our simplifications in a principled way. 

Unit IV demonstrates how the ideas in the first three units may be combined to 
make a complete rendering algorithm. I present the popular techniques of radiosity 
and ray tracing in Chapters 18 and 19 by applying different sets of assumptions and 
simplifications to the radiance equation. Chapter 20 returns to the themes of Unit 
I and discusses how displays affect the perception of a computed image. I present 
some ideas for compensating for this distortion. The unit ends with Chapter 21, in 
which I offer a few opinions about where I think image synthesis is headed. 

Unit V consists of seven appendices. Appendices A-D offer reference material on 
linear algebra and probability, some historical discussion of reflection and refraction, 
and a catalog of analytic form factors for computing radiation exchange. Appendix 
E provides a summary of useful constants and units, Appendix F an interpretation 
of the two most popular standards for describing real physical lighting instruments,
and Appendix G measured spectral emission and reflectivity data for a wide variety 
of materials. For your convenience, the bibliography and index are printed at the 
end of each volume. 

The language of geometry, signals, and physics is largely written in mathematics. 
So there is mathematics in this book, because that’s the best way people have found
for expressing clearly, simply, and precisely what are usually very simple and elegant 
ideas. I’ve tried to use the most straightforward math possible at all times. This 
may mean I’ve used some notation that’s unfamiliar to you. It’s all explained, and 
I hope it’s not at all tricky. There’s lots of discussion about the equations and what 
they mean, and it builds slowly. If you flip through the book now and something 
looks daunting, don’t be concerned: by the time we reach the complex-looking stuff 
it won’t be complex at all, because you’ll know how to read it. 

If you know something about linear algebra (vectors and matrices), and you 
remember the basic ideas of calculus (what integrals and differentials are, even if 
you’re rusty on the mechanics), then you have everything you need to get through 
this book. There’s a short appendix on probability if you’re unfamiliar with that 
field; everything we use in the text is covered there. The occasional forays into other 
areas of math are well-paved. I encourage you to consult standard math texts when 
you want to, but I hope that you will infrequently need to. 

This book does not consider all of computer graphics—such a book would be 
a huge undertaking. I address only image synthesis: the job of converting a scene 
description into a picture. There are many other important subfields in computer 
graphics, including implicit and explicit modeling, motion control, compositing, 
lighting, and more. You can find discussions of these topics and pointers to more 
literature in the general textbooks. A good introductory text is Hearn and Baker 
[199]. More encyclopedic and detailed discussions are available in Foley et al. [147] 
and Watt and Watt [473]. A general introduction without math may be found in my
book for artists and designers [159].

If you’re studying on your own, make use of the references; there’s a world of 
alternate explanations of almost everything in here. If you can study with a friend, 
I encourage you to do so; it’s easier and often much more pleasant than working on 
your own. I have always learned at least as much from my colleagues as I have from 
my teachers. 

I hope that this book is useful both to the student studying independently and 
the student in the classroom. There are some exercises at the end of each chapter. 
These ask mostly for prose descriptions and discussions, rather than mathematical 
manipulation; the goal is to think about what the math represents, not the mechanics 
of how it accomplishes the representation. If the ideas are in place, the mechanics 
will come; going in the other direction is much harder. 

I enjoy computer graphics. I like math and I like art, and image synthesis stimulates
me analytically and emotionally. This book shares with you what I feel are the
most important and rewarding ideas in image synthesis. 

THE HUMAN VISUAL SYSTEM AND COLOR

I cannot forbear to mention among these precepts 
a new device for study which, although it may 
seem but trivial and almost ludicrous, is 
nevertheless extremely useful in arousing the mind 
to various inventions. And this is, when you look 
at a wall spotted with stains, or with a mixture of 
stones, if you have to devise some scene, you may 
discover a resemblance to various landscapes ... 
or strange faces and costumes, and an endless 
variety of objects, which you could reduce to 
complete and well drawn forms. 


Leonardo da Vinci 




INTRODUCTION TO UNIT I 


In this unit we will discuss color images and their perception by human beings.
I believe it is important that creators of images understand how people see, and
how they react to what they see.

There are two principal reasons to create a synthetic image: for analysis by a 
computer, or for display to a person. If we are creating an image for a computer,
then we don’t even need to actually display the image; we need only create a set of 
color values and give them to an analysis program. 

But when we present an image to a person, our task is much more difficult. No 
matter what specific purpose has brought us to create a picture, our primary and 
essential desire is to communicate something to another person. We want to get an 
idea into someone else’s head, and we are going to do it through that person’s sense of 
vision. Anything in our image that doesn’t make it through the visual system will be 
imperceptible by the viewer; we have wasted our time generating such information. 
The human visual system is complex and loaded with idiosyncrasies: for example, 
we sometimes see edges where there are none, or assume an object is concave or 
convex depending on the direction from which it is being illuminated. When we 
look at a picture, we see not just the image displayed and computed, but all the 
artifacts added in by the visual system. The problem with these artifacts is that they 
become part of the message, and augment or distort the message we intend. 

If we don’t wish to waste time computing useless information, and we want to 
avoid visual artifacts that will change our message, we need to understand how the 
visual system works, at least in a basic way. The problem of the representation of 
information is the job of the designer of the image, who must plan for the perception 
of the image. 

My goal in this unit is not to cover everything interesting about the visual system 
(that would take volumes), nor even to cover everything that might be taught in an 
undergraduate vision course. Rather, I have attempted to isolate those features and 
phenomena that I feel are most important to computer graphics. 

We are a long way from fully understanding the human visual system (new 
theories are still being developed). But today’s theories provide a strong basis for 
our new work, and it is that basis we cover in Chapter 1. 

Chapter 2 addresses the description of color and its perception. We will discuss 
color because our computer programs need to calculate with color, adding and
subtracting color representations to determine the amount of light bouncing around 
a scene and ultimately displayed to a viewer. Our goal is to understand how to 
describe colors in a way that allows us to discuss them abstractly and objectively, yet 
still correlates to how they will be perceived. As with the visual system, new color 
systems are still being introduced. 

Compared to vision and color, the field of display technology is moving very
fast, and entirely new devices and principles are constantly replacing old standbys. 
We must say something about displays in order to have at least a feeling for how 
important the mechanics of the display process are to the presentation of an image, 
but the field is too broad and changing too quickly for us to hope to cover it
even superficially. Therefore in Chapter 3 I have chosen to pick just one common,
representative sample, the CRT display, and discuss it in some detail to give an
idea of the thinking that goes into the trade-offs involved in designing
and intelligently using a particular display. Note that the term display includes any 
presentation medium, including ink or paper or lasers projected onto granite cliffs. 

I discuss displays in this part of the book to emphasize their relationship to the 
visual system and image fidelity. We can think of image synthesis as a process that 
ends when a file of color values has been computed, so that display of this file is 
a separate problem. But the job of image synthesis isn’t complete until the image 
can actually be viewed by someone, and that requires dealing with the limitations 
and restrictions of real displays. Thus we discuss the CRT in this section as a 
representative of the types of issues involved when designing and using a real system 
for display of images to the human visual system. New hardware and software 
technologies are giving image creators increased control over the mechanisms of 
display, and their interaction with each other and the visual system is important to 
the effective display of an image. 



The mind is the real instrument of sight and
observation, the eyes act as a sort of vessel
receiving and transmitting the visible portion
of the consciousness.

Pliny (A.D. 23-79)



THE HUMAN VISUAL SYSTEM


1.1 Introduction 

The human visual system is composed of two major components: the eyes and the 
brain. A great deal is known about the physiology of the eye, including the operations 
of various sets of cells that seem to work in concert. Much less is understood about 
the brain, but it would be a mistake to neglect the brain as part of the visual system. 
All experiments in which an observer is asked to report on visual sensation implicitly 
include the brain’s processing of the visual signal. In this book we will not venture 
into philosophical distinctions between “brain” and “mind”; for us, the brain will 
serve as the agent of all abstract perception and reasoning. 

We will start with a review of the structure of the human eye, since it acts as the 
initial perceptual filter: signals not perceived by the eye cannot be further refined by 
the brain. We will then survey some of the important features of the human visual 
system as a whole. 

FIGURE 1.1

Physiology of the human eye, shown in cross section.


1.2 Structure and Optics of the Human Eye

An overview of the physiological structure of the human eye is shown in Figure 1.1. 
A “schematic eye” has been developed to facilitate high-level quantitative and structural
discussions of the eye; Gullstrand’s simplified (number 2) schematic eye is
shown in Figure 1.2 [123]. Some numerical values for that schematic are given
in Table 1.1. A more complete, though more complex, schematic eye has been 
introduced by LeGrand [489]. 

Our discussion of the eye will include two common optical terms: the diopter 
and the visual angle.

The diopter (abbreviated D) is one measure of the power of a lens. It is defined 
as the reciprocal of the focal length of the lens measured in meters. Thus a lens with 
a focal length of 0.1 m (100 mm) has an equivalent power of 10 diopters.
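
The definition above is a simple reciprocal, and a minimal Python sketch makes the conversion explicit in both directions. The function names are illustrative only and do not come from the text.

```python
def focal_length_to_diopters(focal_length_m: float) -> float:
    """Optical power in diopters: the reciprocal of the focal length in meters."""
    if focal_length_m == 0:
        raise ValueError("focal length must be nonzero")
    return 1.0 / focal_length_m


def diopters_to_focal_length(power_d: float) -> float:
    """Focal length in meters for a lens of the given power in diopters."""
    if power_d == 0:
        raise ValueError("a lens of zero power has no finite focal length")
    return 1.0 / power_d


# The example from the text: a 0.1 m (100 mm) lens.
example_power = focal_length_to_diopters(0.1)
print(f"{example_power:.1f} D")
```

Because power and focal length are reciprocals, a stronger lens (more diopters) brings light to a focus over a shorter distance.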

Another important optical measure is the visual angle. This is the angle subtended
by some structure when seen from the nodal point inside the eye, as shown in 
Figure 1.3. 
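
As a rough illustration of visual angle, the sketch below computes the angle subtended by an object of a given size. It assumes the object is flat, centered on the line of sight, and perpendicular to it, and that the distance is measured from the nodal point; the formula is the usual geometric one rather than anything stated in the text, and the function name is hypothetical.

```python
import math


def visual_angle_deg(object_size: float, distance: float) -> float:
    """Angle in degrees subtended by an object of the given size.

    Both arguments must use the same length unit. The distance is assumed
    to be measured from the nodal point of the eye to the object.
    """
    if distance <= 0:
        raise ValueError("distance must be positive")
    return math.degrees(2.0 * math.atan2(object_size / 2.0, distance))


# An object 1 cm across, seen from 57.3 cm, subtends very nearly 1 degree.
angle = visual_angle_deg(1.0, 57.3)
print(f"{angle:.3f} degrees")
```

For small angles the result is close to the ratio of size to distance expressed in radians, which is why halving the distance roughly doubles the visual angle.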

The most important structural elements in the optical path are the cornea, iris,
pupil, lens, and retina.

The cornea is a clear coating over the front of the eye. The cornea has two
purposes: it serves as a protection mechanism against physical damage to the internal
structure, and it provides initial focusing and concentration of the incoming light. A
typical human cornea has an optical power of about 40 diopters, due to its curvature
and the refraction (or bending of light) that occurs when the light passes from air
into the corneal tissue. The cornea is the strongest focusing element in the eye.

FIGURE 1.2

Gullstrand’s simplified (number 2) schematic eye.

FIGURE 1.3

Visual angle is measured from the nodal point.

                                         Unaccommodated    Accommodated (8.62 D)

Radii of curvature
  Cornea                      r1         +7.80 mm          +7.80 mm
  Lens anterior               r2         +10.00 mm         +5.00 mm
  Lens posterior              r3         -6.00 mm          -5.00 mm

Refractive indices
  Air                         n1         1.000             1.000
  Aqueous                     n2         1.336             1.336
  Lens                        n3         1.413             1.413
  Vitreous                    n4         1.336             1.336

Axial separations
  Anterior chamber            d1         3.60 mm           3.20 mm
  Lens                        d2         3.60 mm           4.00 mm
  Vitreous                    d3         16.97 mm          16.97 mm

Surface powers
  Cornea                      F1         +43.08 D          +43.08 D
  Lens anterior               F2         +7.70 D           +15.40 D
  Lens posterior              F3         +12.83 D          +15.40 D

Equivalent powers
  Lens                                   +20.28 D          +30.13 D
  Eye                                    +59.60 D          +68.22 D

Equivalent focal lengths
  Anterior                    f          -16.78 mm         -14.66 mm
  Posterior                   f'         +22.42 mm         +19.58 mm

TABLE 1.1

Gullstrand’s simplified (no. 2) schematic eye. Source: Data from Davson, ed., The Eye, 4:103.
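
Several entries in Table 1.1 can be checked against one another. The sketch below assumes two standard paraxial-optics formulas that the text does not state: the power of a single refracting surface, (n_after - n_before) / r, and the equivalent power of a thick lens, F_front + F_back - (d / n) F_front F_back. The function and variable names are illustrative; the numbers are the unaccommodated values from the table, converted to meters.

```python
def surface_power(n_before: float, n_after: float, radius_m: float) -> float:
    """Power in diopters of one refracting surface (assumed paraxial formula)."""
    return (n_after - n_before) / radius_m


def thick_lens_power(front_d: float, back_d: float,
                     thickness_m: float, n_lens: float) -> float:
    """Equivalent power of a thick lens from its two surface powers
    (assumed thick-lens formula, not taken from the text)."""
    return front_d + back_d - (thickness_m / n_lens) * front_d * back_d


# Unaccommodated values from Table 1.1.
N_AIR, N_AQUEOUS, N_LENS, N_VITREOUS = 1.000, 1.336, 1.413, 1.336

cornea = surface_power(N_AIR, N_AQUEOUS, 0.00780)
lens_front = surface_power(N_AQUEOUS, N_LENS, 0.01000)
lens_back = surface_power(N_LENS, N_VITREOUS, -0.00600)
lens = thick_lens_power(lens_front, lens_back, 0.00360, N_LENS)

for name, value in [("cornea", cornea), ("lens anterior", lens_front),
                    ("lens posterior", lens_back), ("lens equivalent", lens)]:
    print(f"{name}: {value:+.2f} D")
```

Under these assumptions the computed values agree with the tabulated surface powers and the equivalent lens power to two decimal places. The equivalent power of the whole eye also depends on the positions of the principal planes, which this sketch does not attempt to model.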

The iris is a colored annulus behind the cornea but before the lens. The iris 
contains radial muscles that allow it to change the size of its inner hole, the pupil. 
Only light passing through the pupil proceeds further into the eye. 

Light passing through the pupil opening then strikes the transparent crystalline 
lens. The lens is surrounded by a set of muscles called the ciliary body, which can
pull at the sides of the lens. When the ciliary muscles are relaxed, the lens is stretched 
radially, flattening it and reducing its optical power; the light entering the eye is now 
brought to a focus as far from the lens as possible. The ciliary muscles may be tensed
to exert a compressive force on the lens: its diameter shrinks, the lens becomes
thicker, the optical power increases, and the focal point moves closer to the lens.
Thus when the muscles are relaxed, the lens has its longest focal length. When the 
muscles tense, the lens is focused on nearer objects. 

The ability of the lens to stretch in reaction to the pressure from the ciliary body 
is called accommodation. The range of accommodation is a function of elasticity,
which diminishes with age. In a young child the lens typically has a range from 10 to 
30 diopters. Past age 45 the lens has usually lost most of its elasticity, and remains 
in a rigid, slightly stretched state [123]. 

Light focused by the lens falls on the retina, a thin but extensive layering of
cells covering about 200° on the back of the eye. The retina contains two types 
of photosensitive cells: rods and cones. Cones are primarily responsible for color 
perception; rods are limited to intensity, though they are typically ten times more 
sensitive to light than cones. Rods are also physically smaller structures than cones, 
so more of them may be packed into any given space, improving spatial resolution. 

Although most of the retina is photosensitive, there is a small region at the center 
of the visual axis known as the fovea, which subtends only 1 or 2° of visual angle.
The structure of the retina is roughly radially symmetric around the fovea. The fovea 
contains only cones, and it is here that we find the densest collection of cones on the 
surface of the retina: there are about 147,000 cones per square millimeter.

In contrast, the soaring hawks (buteos) have as many as 1 million receptors in 
the same area [412]. Because their optics are also somewhat specialized, hawks may 
have vision as much as eight times better than ours; they can see a small object on 
the ground at a distance from which we could not even see the bird in the sky. 

Moving outward from the fovea, rods begin to appear among the cones, and 
at the edge of the fovea there are more rods than cones, as shown in Figure 1.4. 
Traveling further on a radial path from the fovea, the rods begin to form rings 
around each increasingly infrequent cone, as shown in Figure 1.5 (color plate). The 
highest density of rods appears at about 20° from the fovea. In total, the human eye 
contains about 120 million rods and 6 million cones. Since the optic nerve contains 
only about 1 million fibers, the eye must perform a lot of processing before the visual 
signal ever reaches the brain. 
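
The scale of that processing can be estimated from the counts just given. This is only back-of-the-envelope arithmetic on the approximate figures in the text; the variable names are illustrative.

```python
# Approximate counts quoted in the text.
RODS = 120_000_000
CONES = 6_000_000
OPTIC_NERVE_FIBERS = 1_000_000

photoreceptors = RODS + CONES
receptors_per_fiber = photoreceptors / OPTIC_NERVE_FIBERS

print(f"{photoreceptors:,} photoreceptors share {OPTIC_NERVE_FIBERS:,} fibers")
print(f"about {receptors_per_fiber:.0f} photoreceptors per fiber on average")
```

This is an average over the whole retina and says nothing about how the convergence varies between the fovea and the periphery.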

There are two important aspects of Figure 1.4 that deserve mention. The first is 
that the number of photoreceptors diminishes as we work our way outward from 
the fovea. This would suggest that we have our greatest visual acuity in the region 
in the center of our visual field, and less precision as we work our way out. The 
second feature of the graph is the blind spot, where the optic nerve meets the retina
and there are no photoreceptors at all. 

Figure 1.4 is based on classic work performed by Østerberg in 1935, and represents
photoreceptor counts only along one radial line through the retina. A more
recent series of studies has produced a far more detailed set of maps of the
distribution of photoreceptors in the retina. Curcio et al. [112] have measured the
population of rods and cones and produced the map of Figure 1.6 (color plate).
In these color-coded maps of the retina, the fovea is always at the center, and the
orientation is consistent (though the scale factor with respect to the center changes).
The color scales indicate the number of thousands of cells per square millimeter.

FIGURE 1.4

Density of receptors. Redrawn from DeValois and DeValois, Spatial Vision, fig. 3.4, p. 60.

Figure 1.6(a) shows the density of cones over the entire retina. A
close-up of the fovea is shown in (b). Parts (c) and (d) show the same regions but 
plot rod density; a close-up of the rod density near the fovea is shown in (e). 





These figures confirm our earlier statements. From Figure 1.6(a) we see that there 
aren’t very many cones in the outer portions of the retina, but that the number jumps 
suddenly when we reach the fovea. In (b) we see that this increase is quite abrupt. 
Rods, however, are numerous in the retina outside the fovea, and from Figure 1.6(c) 
we can see that at about 6 mm from the center there is a particular high-density 
annulus called the rod ring. The density of rods falls off slowly as we approach the
fovea, and then drops off suddenly; this drop-off can be seen in more detail in panel
(d). An even closer view in (e) shows that the rod density drops to zero right in the
center of the fovea in the rod-free zone. Note that the rod-free zone in (d) is precisely
where the cones are densest in (b). 

The change in photoreceptor density is directly related to a change in our perceptual
acuity in the image falling on that part of the retina. To demonstrate the
changing acuity in our gaze, consider Figure 1.7. Close one eye, and hold this
image about arm’s length directly in front of your open eye. Stare fixedly at the
center. The larger letters are projected onto the less densely populated regions of
the retina, where acuity is lower, while the smaller letters fall on denser regions,
where acuity is higher. The letter sizes offset this falloff, so all of
the letters in the figure should be equally legible.

There are many ways to demonstrate the blind spot, but we must be careful to 
distinguish the purely physical effect from additional psychological effects. Sometimes
the visual system will “fill in” information that is logical, but not explicitly
presented in a scene; such filling-in processes are known collectively as completion
phenomena. We must then be sure that in attempting to demonstrate a physical 
effect, we isolate it as much as possible from further layers of processing. It can be 
difficult to prevent all completion phenomena, since our experience tells us that we 
seem to see a complete visual field all the time. Given that there is a region of the 
retina where there are no photoreceptors, we must be filling in information all the 
time; otherwise we would see a constant black spot everywhere we look. 

To demonstrate your blind spot, look at Figure 1.8. Close your left eye, and hold 
the figure about arm’s length away from your right eye. Stare fixedly at the cross on 
the left. You may need to move the figure toward or away from you, but at some 
distance the black dot should seem to disappear; at this position the dot is falling 
on the blind spot, and the visual system is completing the white background in this 
region. 

Returning to the anatomy of the eye, the combination of cornea and lens provides 
a total optical power ranging from +60 to +80 D, which translates to a focal length
from about 16.7 to 12.5 mm. A typical human eye is about 24 mm from cornea to
retina, which requires an optical power of about 42 D. Thus there is some extra 
focusing power available in the system to compensate for imperfect shaping of the 
eye, in addition to the flexibility of optical power required to focus on objects from 
very near to very far. 
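
The conversion between optical power and focal length used above is just a reciprocal: a power of P diopters corresponds to a focal length of 1/P meters. As a minimal sketch (the function names are illustrative, not from the text), the figures quoted in the paragraph can be checked like this:

```python
def focal_length_mm(power_diopters: float) -> float:
    """Focal length in millimeters for an optical power in diopters (1 D = 1/m)."""
    return 1000.0 / power_diopters


def power_diopters(focal_length_in_mm: float) -> float:
    """Optical power in diopters for a focal length in millimeters."""
    return 1000.0 / focal_length_in_mm


if __name__ == "__main__":
    # The range of total power quoted for the cornea plus lens.
    for power in (60.0, 80.0):
        print(f"{power:.0f} D -> {focal_length_mm(power):.1f} mm")
    # The power needed to focus at the retina of a 24 mm eye.
    print(f"24 mm -> {power_diopters(24.0):.1f} D")
```

The gap between the roughly 42 D needed for a 24 mm eye and the 60 to 80 D available is the extra focusing power the paragraph refers to.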

As an example of the variation in the shape of the eye, consider eccentricity,
one of the most common structural defects in the human eye. An eccentric eye is
either too long or too short. If the resting eye’s optical system brings light
to a focus at the retina, the eye is called emmetropic, and the lens has sufficient
power to focus on objects both near and far (Figure 1.9(a)). Note that an eye need
not be physiologically ideal to be emmetropic; if the eye is too long but the lens
is correspondingly weak, the focus can still be brought to the retina, and vision is
normal.

FIGURE 1.7

When you look at the center dot in this figure with one eye, all the letters should be equally legible.
Redrawn from Sekuler and Blake, Perception, fig. 3.20, p. 88.

FIGURE 1.8

A diagram for demonstrating the blind spot. The cross is labeled “Fixation cross.”

FIGURE 1.9

Eye geometry. (a) A normal (emmetropic) eye. (b) A myopic eye. (c) A hyperopic eye.

If the lens is normal but the eye is too long, then the eye is myopic; people with
a myopic eye structure are often called nearsighted. When the muscles around the
lens are at rest, light is focused at a point in front of the retina (Figure 1.9(b)).
Tensing the ciliary muscles only increases the optical power of the lens, which brings
the focal point yet closer to the lens, making the problem worse, not better. Until
objects are very near, the lens cannot bring them into focus, because the lens can
only increase its optical power. Corrective lenses are the most common means for
helping people with myopia achieve normal vision.

The opposite problem occurs if the eye is too short, called a hyperopic eye. When 
this eye is at rest, the focal point of the lens is behind the retina (Figure 1.9(c)). 
Objects at infinity may be brought to focus by accommodative effort; hence the term 
farsighted . However, when an object gets too close then the lens cannot compress 
any more, and objects closer still will be out of focus, even though the lens is at 
its maximal optical power. In effect, there is extra, useless optical power left over 
to bring into focus objects “beyond infinity.” Corrective lenses also help hyperopic 
eyes achieve a normal range of accommodation. 

Although we have not said so explicitly, Figure 1.9 describes only a single color 
of light at a time. Recall that a prism breaks up white light into a rainbow because 
of refraction: different colors of light are bent by different amounts when they pass 
from one medium to another. This is also true when the light passes through the 
lens of the eye, so that a sharp white circle is in fact spread out by the time it reaches 
the retina into a little circular rainbow; this inevitable effect is called chromatic 
aberration in the lens. 

This suggests one reason why artists think of red as an “advancing” color and 
blue as a “receding” one [360]. Because the different colors bend slightly differently 
as they pass through the lens, we must exert effort to change the shape of the lens 
to bring the various colors to focus on the retina. To bring a red object to focus 
requires the same action needed to bring a near object to focus, while blue focusing 
is like focusing on a distant object. 


1.3 Spectral and Temporal Aspects of the HVS

The human visual system involves much more than just the eye. Once the light 
has been focused on the retina, many layers of physiological and psychological 
systems process the information, rejecting some pieces of information, emphasizing 
others, and shaping the signal into something that we can then interpret, often as 
representative of physical structures. 

There is a distinct band of electromagnetic energy to which the eye is sensitive,
usually called the visual range or visual band. Although the range of sensitivity
extends into both the infrared and ultraviolet range (albeit at very low sensitivities),
for practical purposes the visual range is usually defined to include light with
wavelengths from 380 to 780 nanometers (1 nanometer = 1 nm = 10⁻⁹ m). We will
defer a detailed discussion of the nature of light and the meaning of wavelength until
Chapter 11. For now, the term wavelength may be thought of as corresponding to
a particular pure (or spectral) color, such as that produced by a laser. Throughout
this book we will indicate the range 380 to 780 nanometers with the symbol $\mathcal{R}_V$.

A visual signal is often represented as a plot of intensity versus wavelength, as in
Figure 1.10. This is called a spectral radiant power distribution, a spectral plot, or
sometimes simply a spectrum.

FIGURE 1.10

An example of a power-versus-wavelength curve.

Such a plot may be used to measure how much light is absorbed (rather than 
radiated) per wavelength by a material. In this case the vertical axis is usually the 
percentage of absorption. 

The photosensitive cells of the eye are not uniformly responsive to all wavelengths 
in the visible range, and the processing that comes after the eye serves to further refine 
the ultimate importance of various regions of the spectrum to an interpretation of 
the image. 

The first step in processing light information is the reception of the light signal by 
the photosensitive cells on the retina. Although most of these cells have a long, thin 
structure, they are not packed into the retina parallel to each other. Rather, they are 
tilted toward the center of the pupil. The result is a directional sensitivity known 
as the Stiles-Crawford effect , in which cones are more responsive to light arriving 
straight on than at an angle through the edge of the pupil [123]. 

Once light has managed to reach the photosensitive material in a rod or cone, it
causes a chemical action that results in a neural signal. The chemical at the heart
of this process has the generic name photopigment. The particular photopigment
found in rods, rhodopsin, has been studied extensively. It has been found that
rhodopsin reacts to light in a bell-shaped curve, centered at about 500 nm. This is
in agreement with the sensitivity of the human visual system during night vision,
when there is not enough light to stimulate the cones, making the rods the dominant
sensor. Cone photopigments have rarely been extracted from primates, but many
psychophysical, psychological, and microspectrophotometric studies have been run
on primate and human observers. The results of these experiments have yielded
consistent information that is probably a reliable description of cone sensitivity.
This information is summarized below.

FIGURE 1.11

Photopigment absorption.

There are three types of cones in the human eye, typically called S, M, and L
(named respectively for their peak response to relatively short, medium, and long
wavelengths), with peaks located at roughly 420, 530, and 560 nm, as shown in
Figure 1.11. The response curves for these cones (as well as the rods) are asymmetrical;
the drop-off on the long-wavelength (low-frequency) side is sharper than on the
short-wavelength (high-frequency) side.
Thus the shorter wavelengths are more readily absorbed than the longer wavelengths
for all three ranges. Both rods and cones may be considered the ultimate in visual
sensitivity: a single photon carries enough energy to produce the chemical reactions
that change the electrical potential at the cell’s membrane, signaling the arrival of
light at that cell.

The signal carried by the change in membrane potential makes up the entire message
sent by a photoreceptor to the rest of the visual system. Thus the only message
sent by a rod or cone is that light has arrived and stimulated the photopigment; 
there is no information transmitted describing the wavelength of the photon. This 
effect is called the principle of univariance [123]. The likelihood of absorption of 
a photon by a particular cell is a function of the spectral sensitivity of the receptor 
and the intensity of the incoming light (e.g., if the receptor is 30% sensitive at some 
wavelength, any particular photon may not be absorbed, but about 30 of every 
100 will). The time-averaged output of a photoreceptor is related to the number 
of photons received over some recent interval, but there is no way to determine the 
frequency distribution of these absorbed photons. It is only by combining the results 
of many photoreceptors with different spectral sensitivities that the visual system 
is able to reconstruct intensity and color descriptions of the incoming signal; this 
reconstruction is believed to happen at a very early stage in visual processing. 
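
The principle of univariance can be illustrated with a small numerical sketch. The bell-shaped sensitivity curve below is a hypothetical stand-in (a Gaussian with an assumed 40 nm width, centered on the M and L peaks quoted above), not measured cone data. It shows two lights of different wavelength that produce the same expected number of absorptions in one receptor type, and that a second receptor type with a different peak responds to them differently.

```python
import math


def sensitivity(wavelength_nm: float, peak_nm: float = 530.0,
                width_nm: float = 40.0) -> float:
    """Hypothetical bell-shaped absorption probability (0..1) for one receptor type."""
    return math.exp(-0.5 * ((wavelength_nm - peak_nm) / width_nm) ** 2)


def expected_absorptions(spectrum: dict, peak_nm: float = 530.0) -> float:
    """Expected photons absorbed, given {wavelength_nm: incident photon count}."""
    return sum(count * sensitivity(wl, peak_nm) for wl, count in spectrum.items())


# A dim light at the receptor's peak wavelength...
light_a = {530.0: 300.0}
# ...and a brighter off-peak light, scaled so this receptor absorbs the same amount.
light_b = {580.0: 300.0 * sensitivity(530.0) / sensitivity(580.0)}

# One receptor type (peak 530 nm) cannot tell the two lights apart.
response_a = expected_absorptions(light_a)
response_b = expected_absorptions(light_b)

# A second receptor type (peak 560 nm) responds differently to the same two
# lights, which is the extra information needed to separate color from intensity.
response_a_l = expected_absorptions(light_a, peak_nm=560.0)
response_b_l = expected_absorptions(light_b, peak_nm=560.0)

print(response_a, response_b)      # equal up to rounding
print(response_a_l, response_b_l)  # clearly different
```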

The principle of univariance may at first seem puzzling: why should the visual 
system have developed in such a way that the very first step in processing throws 
away information that then must be re-derived? The answer is probably similar 
to the reasoning behind the process of dithering , used in graphics when a display 
cannot provide as many colors or gray levels as an image demands [445]. Suppose 
that the eye contained many distinct color sensors with different, narrowly defined 
bands of absorption. Although they might be as close-packed as cones, the number 
of sensors for any particular frequency band in a fixed region would necessarily be 
fewer than if only three types of cones occupied the space, thereby sacrificing spatial 
color resolution. The human eye has evolved with a compromise of three sensors, 
which gives good color sensor density in the retina and a sufficient amount of color 
information to recompute the spectral information of the incident signal. Either the 
number of sensors or their density could be theoretically increased at the expense 
of the other. In fact, the density trade-off can be found in the very center of the 
human fovea. Here there are no S cones to be found at all, so M and L cones are 
able to pack even more tightly [463]. At the other extreme, some birds have five 
to seven different color receptors (produced by a combination of the photoreceptors 
themselves and a layer of oil) [412]. 

Not so easily explained is the curious structure of the retina itself. Surprisingly,
the photosensors are not the innermost layer of cells on the inside of the retina.
Rather, there are several layers of interconnecting cells on top of the photoreceptors,
blocking the light from the lens. The overall density of these cells is quite low, so
most of the incident light gets through. Even more surprising is the fact that the
photoreceptors themselves are oriented so that they face the back of the eye rather
than the pupil, so light must travel through the body of the photoreceptor before
it reaches the photopigment that will trigger a response [123]. These two pieces of
physiology have suggested to some that the retina appears to have evolved “inside-out”
from the structure that we would probably think most efficient. What forces
caused the eye to evolve this way? Are there indeed advantages that we don’t yet
appreciate? Although many people now believe that the retina is simply the result
of an early, mysterious evolutionary preference, these puzzles continue to interest
researchers in the physiology, structure, and function of the visual system.

This seemingly reversed structure of the visual system is common to all vertebrates 
[412]. It suggests that, for vertebrates, eyes are actually part of the brain and 
represent an outgrowth from it. In fact, the cells of the retina are formed during 
development from the same cells that generate the central nervous system; the retina 
truly is part of this essential structure [463]. In contrast, invertebrate eyes come 
from an invaginated bubble in the skin. The photoreceptors in invertebrates all face 
toward the lens, while in all vertebrates they face away from the lens and toward the 
brain. Spiders are unique in that they have both forms of eyes [412]. 

So far we have only discussed the response of the eye to a single photon. In fact, 
the chemical processes that occur inside a photoreceptor last several milliseconds, 
and additional photons that strike the receptor during that time add to the overall 
response. Thus the output of a receptor is really a time-averaged response, an effect 
called temporal smoothing . In effect, the sensors impose a low-pass filter over their 
time response, though the cutoff frequency of that filter changes with respect to the 
background light level: when there is little light arriving, there is little smoothing. 
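
A rough way to see how temporal smoothing suppresses fast flicker is to drive a simple low-pass filter with a blinking light. The sketch below uses a first-order filter with an assumed 10 ms time constant; both the filter form and the time constant are illustrative choices, not a model of real photoreceptors. It reports how much the smoothed output still varies at each blink rate.

```python
def smoothed_ripple(flicker_hz: float, time_constant_s: float = 0.010,
                    dt: float = 0.0001, duration_s: float = 1.0) -> float:
    """Peak-to-peak variation of a first-order low-pass filter's output when
    driven by a 50% duty-cycle square wave blinking at flicker_hz."""
    alpha = dt / (time_constant_s + dt)   # per-step blend factor
    output = 0.0
    lowest, highest = float("inf"), float("-inf")
    steps = int(duration_s / dt)
    for i in range(steps):
        t = i * dt
        light_on = (t * flicker_hz) % 1.0 < 0.5
        output += alpha * ((1.0 if light_on else 0.0) - output)
        if t >= duration_s / 2:           # ignore the start-up transient
            lowest = min(lowest, output)
            highest = max(highest, output)
    return highest - lowest


if __name__ == "__main__":
    for rate in (5.0, 60.0, 300.0):
        print(f"{rate:5.0f} Hz: ripple {smoothed_ripple(rate):.3f}")
```

At slow blink rates the output swings almost fully between dark and light; as the rate rises, the remaining ripple shrinks toward a steady average.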

The effect of temporal smoothing leads to the way we perceive light that blinks, 
or flickers . When the blinking is slow, we perceive the individual flashes of light. 
Above a certain rate, called the critical flicker frequency (or CFF), the flashes fuse 
together into a single continuous image. Far below that rate we see simply a series 
of still images, without an objectionable sense of near-continuity. 

Under the best conditions, the CFF for a human is around 60 Hz [389]. In 
contrast, a bee has a CFF of about 300 Hz. We note that as with most other visual 
phenomena, the flicker rate (that frequency at which flicker becomes noticeable) is 
dependent on many factors, such as ambient light, size of the visual target, and duty 
cycle between the length of time the image is displayed and the blank time (if any) 
between images. For one set of conditions, Figure 1.12 shows the sensitivity of the 
eye to different frequencies of flicker. Very early movies flickered because there were 
not enough frames displayed per second to cause the eye to integrate the images; 
they were perceived as a flickering series of still photos. 

We saw earlier that a sensor reacts to an incoming photon with a chemical change, 
which is then communicated to the neural circuitry in the eye by a change in electric 
potential at the cell’s membrane. There is an additional complication, however, 
that enables the eye to respond to enormous variations in levels of incoming light. 
The phenomenon of adaptation gives the system great sensitivity when the overall 
illumination is low, and some (though less) sensitivity when the overall illumination 
is high [123]. Although maximum sensitivity over all illumination ranges would be 
best, this appears to be a difficult problem for any receiving system. Given the need
to compromise, it seems very desirable to have the most sensitivity at low light levels
where small variations carry a great deal of information.

FIGURE 1.12

Flicker sensitivity. Redrawn from Sekuler and Blake, Perception, p. 254.

The range of adaptation is extremely large. Figure 1.13 gives the average luminance
of backgrounds against which we often view the world [355]. The luminance is
measured in candelas per square meter, which may be considered the light generated 
by a typical candle (a more formal definition is given in Appendix E). 

Because rods are about ten times as sensitive as cones, they are most useful for
night (or scotopic) vision, when ambient light levels are low. Figure 1.14 shows
the response of rods to different levels of incident light, and thus different levels of
adaptation. At low levels of light ($L_A$), rods in their “normal” state are sensitive
in terms of both amplitude and wavelength; a small number of photons is likely to
produce a signal, and a change in the average wavelength will produce a change
in response. At higher light levels ($L_B$), the intensity-response curve has begun to
flatten out, and rods are less sensitive to both the number of photons and changes
in wavelength. Beyond a certain intensity ($L_C$), the rods are hyperpolarized, or
completely saturated, and release no synaptic chemicals, and thus do not contribute
to vision. This saturation typically occurs at daylight levels of illumination.

In daylight (or photopic) levels of illumination, it is the cones that are the most
useful detectors of light information. When a cone has adapted to a particular level
of light intensity, it performs just like the rods: light intensities beyond a particular
level will cause the cone to hypersaturate and stop sending neural signals. For
example, in Figure 1.14 a cone that is adapted to light level $L_C$ will not be able to
distinguish light levels $L_D$ and $L_E$. However, if we assume that the incident light is
at level $L_D$ for some time, the cone will adapt, shift its response curve to center at
that point, and thus be able to distinguish light levels $L_D$ and $L_E$.

FIGURE 1.13

Background                     Luminance (candelas per square meter)

Horizon sky
  Moonless overcast night      0.00003
  Moonless clear night         0.0003
  Moonlit overcast night       0.003
  Moonlit clear night          0.03
  Deep twilight                0.3
  Twilight                     3
  Very dark day                30
  Overcast day                 300
  Clear day                    3,000
  Day with sunlit clouds       30,000

Daylight fog
  Dull                         300-1,000
  Typical                      1,000-3,000
  Bright                       3,000-16,000

Ground
  Overcast day                 30-100
  Sunny day                    300
  Snow in full sunlight        16,000

Luminance of everyday backgrounds. Source: Data from Rea, ed., Lighting Handbook 1984
Reference and Application, fig. 3-44, p. 3-24.

You may augment the frequency response information discussed above with the
further processing carried out by the rest of the visual system by performing
psychological and neurophysiological experiments. In the final analysis, you may distill
the results into a final set of curves that provide the overall frequency sensitivity of
the human visual system at some particular level or range of illumination. Often
two curves are presented: one for low-level (scotopic) illumination, where the rods
provide the most information, and the other at high-level (photopic) illumination,
where the cones predominate. Typical scotopic and photopic luminous efficiency
functions are given in Figure 1.15.

FIGURE 1.14

Rod and cone adaptation. Redrawn from DeValois and DeValois, Spatial Vision, fig. 3.14.

Note that there is a shift in the frequency of peak sensitivity due to the different 
photopigments of rods and cones. You can experience this change in peak perception, 
called the Purkinje shift , by watching a red or yellow flower with dark green leaves 
at sunset. When the sun is still above the horizon, your cones are active, and the 
yellow flower will appear lighter than the leaves because yellow is closer to the peak of
the photopic sensitivity curve than dark green. When the sun has set and light levels
are lower, your rods are the principal sensors. The scotopic sensitivity curve is more 
responsive in the shorter wavelengths, so the green leaves will now appear relatively 
lighter than the yellow flower, though both will of course be much darker due to the 
lower amount of incident light. 

FIGURE 1.15

Luminous efficiency curves. Redrawn from Wyszecki and Stiles, Color Science, fig. 2(4.3.2), p. 258.


Rods and cones can only respond to the light that reaches them. As we mentioned 
earlier, the light must pass through the inner layers of the retina, which can absorb 
some light. The light must also pass through the eye itself, going through the lens 
and the other components of the eye. For example, the lens in the human eye 
changes color with time, becoming increasingly yellow as a person ages [489]. Thus, 
the lens acts as a yellow filter, which obviously affects the spectral distribution of 
light striking the retina. Measurements of the transmissive characteristics of the 
eye have been carried out by Boettner and Wolter [52]; their data are summarized 
in Figure 1.16. Note the transmission curves are both of high magnitude and flat 
within the visual band. 

FIGURE 1.16

Transmittance of the human lens. Solid curve = total transmittance of a 4-1/2-year-old lens.
Long-dashed, short-dashed, and dotted curves = direct transmittance of 4-1/2-, 53-, and 75-year-old
lenses, respectively. Redrawn from Boettner and Wolter in Investigative Ophthalmology, fig. 7,
p. 781.


1.4 Visual Phenomena

The human visual system is sufficiently complex that much of our understanding 
comes from trying to understand intriguing phenomena that are revealed by physical 
experiments. Some of these are familiar in computer graphics because we produce 
images that tend to exaggerate these effects; others are less well known in the graphics 
community. 

We present here a short summary of some of these phenomena. 


1.4.1 Contrast Sensitivity 

Suppose that an observer is shown a sheet of paper with reflected intensity $I$, and
inside there is a smaller sheet with a slightly different intensity $I + \Delta I$, as in
Figure 1.17(a). We would like to find the smallest just noticeable difference (or jnd) $\Delta I$
such that the observer will report that the inner region is of a different intensity than
the outer region. Over a wide range of intensities, the ratio $\Delta I / I$ (called the Weber
fraction) is nearly constant with a value of about 0.02, as shown in Figure 1.17(b)
[345]. This curve is known as the contrast sensitivity function, or CSF.

FIGURE 1.17

A contrast sensitivity experiment. (a) A small region in a larger one. (b) The just-noticeable-difference
curve.

The curve of Figure 1.17(b) suggests that the human visual system is responsive
to ratios of intensities, not absolute values. We note that $dI$ is the limit of $\Delta I$, and
that $d[\log(I)] = dI/I$. This suggests that there is a constant $k$ such that increasing the
logarithm of a signal by $k$ corresponds to a just-noticeable difference in the intensity.
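
To make the logarithmic relationship concrete, here is a minimal sketch that counts how many just-noticeable steps separate two intensities, assuming the Weber fraction stays fixed at the approximate value of 0.02 quoted above. Under that assumption each step multiplies the intensity by 1.02, so the constant step in log intensity is log(1.02). The function name is illustrative.

```python
import math

WEBER_FRACTION = 0.02  # approximate value quoted in the text


def jnd_steps(intensity_low: float, intensity_high: float,
              weber_fraction: float = WEBER_FRACTION) -> float:
    """Number of just-noticeable steps between two intensities, assuming each
    step multiplies the intensity by (1 + weber_fraction)."""
    return math.log(intensity_high / intensity_low) / math.log(1.0 + weber_fraction)


if __name__ == "__main__":
    # Doubling the intensity spans the same number of steps at any starting
    # level, because only the ratio matters.
    print(f"{jnd_steps(1.0, 2.0):.1f}")      # about 35 steps
    print(f"{jnd_steps(100.0, 200.0):.1f}")  # the same count
```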

We can also measure contrast sensitivity with respect to a signal of changing or 
constant frequency. A common such signal is a grating , which is simply a series of 
vertical bars. If the bars have sharply defined edges, a horizontal profile through the 
image would look like a square wave; a smoother transition would have a profile 
more like a sine wave. The frequency of a grating is measured by the number of cycles 
per millimeter on the retina; our response to different gratings is called the contrast 
sensitivity function (CSF). The response of a human adult to sine-wave gratings 
of different frequencies is shown in Figure 1.18. Note that for each frequency a 
certain amount of contrast is required to perceive the grating; if the contrast is lower 
than this amount, we see only a flat gray field. For a particular contrast, there is 
some frequency of sine wave which we are best able to detect. For frequencies that 
are higher and lower than that peak, we require more contrast in order to see the 
variation. 

Our contrast sensitivity is also dependent on whether we are using our rods or 
cones. Figure 1.19 shows the difference in our CSF for scotopic (night) and photopic 
(day) vision. 

FIGURE 1.18

Contrast sensitivity for sine waves. Redrawn from Sekuler and Blake, Perception, fig. 5.18, p. 155.


Adaptation plays an important part in our contrast sensitivity. When the eye has 
adapted to a particular frequency, the sensitivity to information at and near that 
frequency is decreased, as shown in Figure 1.20. 

Figure 1.21 shows the CSF for a human infant and an adult. Notice that our 
sensitivity increases with age. An important implication of this curve is that infants 
cannot see high-frequency information as well as an adult. To an infant, the world 
beyond a short distance appears blurry, as with extreme myopia. As the child ages 
through its first year, its nervous system becomes more complex and capable of 
encoding the high-frequency information that is striking its retina. As its ability to 
transmit high-frequency information matures, the world comes into sharper focus. 

As we age beyond about 20, our sensitivity to high frequencies begins to drop off, 
as shown in Figure 1.22. This decrease in sensitivity probably comes from a decrease 
in the pupil size of the eye [389], which decreases the amount of light arriving at the 
retina. 





FIGURE 1.20

CSF in response to frequency adaptation. The vertical axis is contrast sensitivity (1/threshold
contrast). Redrawn from Sekuler and Blake, Perception, fig. 5.28, p. 167.

FIGURE 1.22

CSF for an adult from about 20 to 80. The vertical axis is contrast sensitivity (1/threshold
contrast). Redrawn from Sekuler and Blake, Perception, fig. 5.26, p. 164.

FIGURE 1.23

The CSF with respect to orientation. Redrawn from Bouville et al. in Proc. Eurographics ’91,
fig. 1.


This response depends on direction. Our ability to resolve a grating of a given 
frequency and contrast is best when that grating is horizontal or vertical [433] as 
shown in Figure 1.23. 

A full discussion of the CSF could easily fill a chapter; interested readers are 
encouraged to consult the references in the Further Reading section. 


1.4.2 Noise 

Many human senses are tolerant of noise. For now, we will simply consider noise 
to be a signal that seems to have a strong random component that is added in to the 
signal we care about. An example from the audio domain is tape hiss , which is the 
sound made by blank audio tape. A visual example is static on a television signal, 
where colors are occasionally wrong and there is a sprinkling of white or black spots. 

As long as this noise isn’t too extreme, the human visual system tends to be very 
good at ignoring it [218]. This is probably the result of how the photoreceptors are 
distributed on the inside of the retina [479,496]. This relative acceptance of noise 
will prove to be of great value to us when we discuss the phenomenon of aliasing 
and ways to control it, in Units II and III. 


1.4.3 Mach Bands

People who make Gouraud-shaded polygonal images are familiar with Mach bands. 
Named for the Austrian physicist Ernst Mach, Mach bands are an illusion that 
variously emphasizes edges or suggests edges in a picture where the intensity is in 
fact changing smoothly. 

Figure 1.24 shows a set of vertical gray bars. Beneath them is a plot of their gray 
values. Look near the boundary between any two bars. Although the intensity is 
constant across each bar, the right side of each bar appears a bit darker than the 
middle of the bar, and the left side appears a bit lighter. The transition from one bar
to another is emphasized by these illusory changes in the intensity.

This sort of figure prompts the folklore theorem in computer graphics that Mach 
bands arise where the first derivative of the intensity is discontinuous. In this case, 
we have Mach bands around spikes in the first derivative. 

In Figure 1.25 we have a smooth gray transition, yet we still see vertical bands 
where the intensity changes quickly. Here all the derivatives of the intensity signal 
exist and are smooth, so our folklore isn’t a complete predictor of the problem. 

The origin of Mach bands is not completely understood, but a reasonable explanation
involves the retinal ganglion cells [388]. In a simplified model of the eye,
these cells act as weighted integrators of the intensity signal coming from the
photoreceptors. The integration is organized spatially; the geometric arrangement of the
photoreceptors is part of how they are interpreted. The type of retinal ganglion cell
we will consider integrates over a small circular region on the retina. These cells sum
the photoreceptor response in the center of this region, and subtract the photoreceptor
signal in the annulus outside this disk but within the region of integration. The
effect of some cells reducing the response of nearby cells is sometimes referred to as
lateral inhibition.

Figure 1.26 shows four of these cells overlaid on a pair of bars. Cell A is
completely covered by the darker bar and cell D by the lighter one. The additive
center of cell B is on the darker bar but its subtractive outer annulus is partly on the
lighter bar. Because more signal is subtracted from B than from A, cell B will
report a slightly darker value. Similarly, the additive center of cell C is in the lighter
area, but its subtractive annulus is partly in the darker bar; less is subtracted away
from the center of C than from the center of D, so C will report a lighter value than D.
Since this happens at all points along the boundary, and the effect increases as we
get nearer to the boundary, the left edge of the boundary looks darker and the right
edge lighter than the centers of the respective bars.
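
The center-minus-surround idea can be tried on a one-dimensional strip of samples. The sketch below is a deliberately crude stand-in for the cells described above: each "cell" takes one sample as its center and subtracts a weighted average of its neighbors, with a hypothetical radius and surround weight chosen only for illustration. Applied to two flat bars, it produces a dip just on the dark side of the boundary and a peak just on the light side.

```python
def center_surround_response(signal: list, radius: int = 3,
                             surround_weight: float = 0.5) -> list:
    """Response of one idealized cell per sample: the center sample minus a
    weighted average of its neighbors within `radius` (clamped at the ends)."""
    last = len(signal) - 1
    responses = []
    for i, center in enumerate(signal):
        neighbors = [signal[min(max(j, 0), last)]
                     for j in range(i - radius, i + radius + 1) if j != i]
        surround = sum(neighbors) / len(neighbors)
        responses.append(center - surround_weight * surround)
    return responses


# Two flat bars: a darker one (0.3) on the left and a lighter one (0.7) on the right.
bars = [0.3] * 10 + [0.7] * 10
response = center_surround_response(bars)

# Far from the boundary the response is flat; next to it, the dark side dips
# and the light side rises, exaggerating the edge.
print(response[0], response[9])    # interior of dark bar vs. its edge
print(response[19], response[10])  # interior of light bar vs. its edge
```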

This analysis is probably too simple, but it suggests that we are likely to see 
Mach bands in regions where the intensity is changing quickly; thus, a “large” first 
derivative is sufficient (though not necessary) to predict the perception of a Mach 
band. The interpretation of “large” depends on the context of the image, the viewing 
conditions, and the viewer. 

FIGURE 1.24

Gray wedges in equal increments of intensity.

FIGURE 1.25

A smooth gray transition.

FIGURE 1.26

Neural analysis of Mach bands.

FIGURE 1.27

Lightness contrast. All of the interior regions are the same gray value.


1.4.4 Lightness Contrast and Constancy 

The phenomenon of lightness contrast (also called simultaneous contrast) is illustrated
in Figure 1.27. Here we have a patch of a given gray value surrounded by
a number of other patches of different gray values. The apparent lightness of the 
patch seems to depend on the surrounding gray value; the darker the surrounding 
gray value, the lighter the patch appears. 

This phenomenon makes it difficult for us to pick two intensities (or, with suitable 
extensions, two colors) at random and expect them to behave in predictable ways 
throughout an image. For example, a typical shorthand for representing a nighttime 
scene is a horizontal wash of color, light at the bottom (to represent the light from 
the setting sun) and dark at the top (to show the night sky), as in Figure 1.28. 

Suppose we have a flying object in this scene, such as a bird or flying saucer that
is not shaded in three dimensions (3D) but rather has a constant shading. As the
object moves vertically in the scene, it will appear to get lighter and darker. If you
spot-check a few frames of the animation, you may see this change in lightness and
be concerned, but the phenomenon is a normal part of our experience and does not
need correction. When the change in the surrounding lightness is dramatic, some
compensation may make the scene appear more natural.

FIGURE 1.28

A horizontal wash sometimes used as a background for a night scene.

The phenomenon of lightness constancy allows us to accept a scene as the same 
in both day and night, when the level of illumination is very different. For example, 
suppose you are reading at your desk one evening. The book in front of you is 
printed on white paper that reflects, say, 40% of the incident light, and the black 
ink of the printing reflects only 5%. Now you turn on another lamp which doubles 
the illumination in the room. The black print is now reflecting twice as much light 
energy back to you, but the print doesn’t appear twice as bright. This is because the 
white page is also twice as bright, so the ratio has remained the same. 
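
The arithmetic behind this example can be checked in a few lines. The sketch below uses the 40% and 5% reflectances from the text; the illumination level of 100 units is an arbitrary value assumed only for illustration, and the function names are hypothetical.

```python
# Reflectances taken from the reading example in the text.
PAPER_REFLECTANCE = 0.40   # white paper reflects 40% of incident light
INK_REFLECTANCE = 0.05     # black ink reflects 5% of incident light

def reflected(illumination, reflectance):
    """Light returned to the viewer, in the same arbitrary units as the illumination."""
    return illumination * reflectance

def paper_to_ink_ratio(illumination):
    """Ratio of light reflected by the paper to light reflected by the ink."""
    return reflected(illumination, PAPER_REFLECTANCE) / reflected(illumination, INK_REFLECTANCE)

one_lamp = 100.0           # assumed illumination level (arbitrary units)
two_lamps = 2.0 * one_lamp # the second lamp doubles the illumination

# The ink reflects twice as much light under two lamps ...
print(reflected(one_lamp, INK_REFLECTANCE), reflected(two_lamps, INK_REFLECTANCE))
# ... but the paper-to-ink ratio does not change.
print(paper_to_ink_ratio(one_lamp), paper_to_ink_ratio(two_lamps))
```

The absolute amount of light from the ink doubles, but the ratio of paper to ink stays at about 8 to 1 in both cases, which is the quantity the text suggests the visual system responds to.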

Lightness constancy is a powerful feature of the visual system and is one of the 
phenomena that makes it possible for us to maintain a consistent mental image of 
the world, despite dramatic changes in the level of illumination. We can explain both 
lightness contrast and lightness constancy on a general level using the same ideas of 
retinal ganglion cells we used for Mach bands [389]. 
FIGURE 1.29

Eye placement for a rabbit and a person. 


1.5 Depth Perception 

The human visual system is capable of constructing a 3D view of the world. This 
ability, called depth perception, comes from many different kinds of visual
information, some of which may be gathered from one eye alone, and some of which require
two eyes.

To see how two eyes work together, consider the placement of the eyes on the 
heads of a rabbit and a person, as in Figure 1.29. The rabbit has almost 300° of 
vision, though only a small amount of the visual field is seen by both eyes. The 
human has a smaller total field of view, but the two eyes overlap in a much larger 
region. Other examples of eye placement are the snail, which has eyes on the ends 
of flexible stalks so that the regions of visibility and overlap may be changed at will, 
and the whale, which has eyes so far apart on the sides of its head that it is completely 
blind straight ahead [412]. Spiders and scorpions have clusters of at least six eyes, 
and some have eight; there is a significant amount of field overlap. 

In general, predatory animals have their eyes near the front of the head with a lot 
of overlapping field, for better depth estimation when going after prey. Conversely, 
animals that are preyed upon have their eyes far apart, the better to see more of the 
environment and respond to potential attacks. For example, the owl’s eyes have a 
very large region of overlap. The woodcock is a bird that eats small mud worms by 
sticking its long, sensitive bill deep into the mud to seek out the unseen worms. The 
FIGURE 1.30

Types of depth information. Adapted from Sekuler and Blake [389]. 


woodcock’s eyes are perched on the extreme sides of its head, so the bird can see all 
around and behind itself while immersed in mud [340]. 

The types of depth information we gather from our visual field are summarized 
in Figure 1.30. We will discuss these cues one by one below; much of the discussion 
is based on material in Sekuler and Blake [389]. 


1.5.1 Oculomotor Depth

Oculomotor effects come from the muscular adjustments in our eyes. When you 
look at something, you use the muscles surrounding your eyes to converge them, or
physically rotate them to bring the point of attention, or fixation point, to fall on
the fovea. You also accommodate by changing your focus, tensing or relaxing your 
ciliary body to adjust the thickness of the crystalline lens inside your eye. 

Neither of these effects is a particularly robust or accurate indicator of depth 
information, since they only relay useful information for objects very nearby. When 
you are looking at an object more than about 6 meters away, the ciliary body is at 
its most relaxed state, and your eyes are effectively converged on infinity (looking 
straight ahead). Thus, for 6 meters and beyond there are basically no oculomotor 
cues that contribute to depth perception. 
The other major category of depth cues consists of the visual cues, which are divided
into the binocular and monocular classes, depending on whether they involve two
eyes or one.


1.5.2 Binocular Depth

When two eyes are involved in a vision task, it is termed a binocular activity. The 
ability to make depth judgements based on information from binocular vision is 
called stereopsis. Stereopsis can provide very precise information on the depth of
objects in a scene. 

For example, suppose you hold two pencils vertically about 1 meter from your 
eyes. Stereopsis makes it possible for you to see a 1-mm disparity in the distances of 
those two pencils; that’s a rather remarkable precision of 1 unit in 1,000. 
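
To get a feel for how small this signal is, we can estimate the difference in convergence angle between the two pencils. The following is a rough geometric sketch, not a model of the visual system: it assumes the pencils lie straight ahead of the viewer and that the eyes are 65 mm apart (an assumed value that does not come from the text). The function names are hypothetical.

```python
import math

def vergence_angle(distance_m, eye_separation_m=0.065):
    """Angle (radians) between the two lines of sight when both eyes
    fixate a point straight ahead at the given distance."""
    return 2.0 * math.atan(eye_separation_m / (2.0 * distance_m))

def disparity_arcsec(near_m, far_m, eye_separation_m=0.065):
    """Difference in vergence angle between two depths, in arc seconds."""
    diff = vergence_angle(near_m, eye_separation_m) - vergence_angle(far_m, eye_separation_m)
    return math.degrees(diff) * 3600.0

# Two pencils about 1 meter away, 1 mm apart in depth (the example in the text).
print(disparity_arcsec(1.000, 1.001))
```

Under these assumptions the angular difference is on the order of ten arc seconds, which gives some sense of why a precision of 1 part in 1,000 is remarkable.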

To perceive depths based on binocular information, the visual system needs to 
perform two tasks that are (at least conceptually) distinct. The first is to match 
features in the two images, followed by a calculation of their retinal disparity, or
relative displacement in the retinal images. 

We can imagine that feature matching begins with feature extraction, or finding
significant objects in both images, followed by feature correspondence, which
identifies like features in the two images. An example of this is suggested by looking at
a room full of books; the first stage of processing would identify each book-shaped
blob in each image as a “book.” Two red books may then be put into
correspondence. Though it seems reasonable, this theory can be easily disproved.

The famous demonstration that disproves it makes use of a random-dot stereogram, as shown in
Figure 1.31. When you view these images as suggested in the caption, directing one 
image only to each eye, neither eye sees any of the other image. Since the images 
are made simply of black and white dots, there are no common features to extract 
and then merge; any black dot could match any other black dot. Yet when properly 
viewed, a very distinct 3D structure with two layers is revealed. The experiment may 
be repeated with more complex shapes and a larger number of identifiable layers. 
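
A stereogram pair of this kind is simple to construct. The sketch below shows one common construction (an assumption on our part, not necessarily how Figure 1.31 was made): both images start as the same field of random dots, then a central square is shifted a few pixels sideways in one image and the uncovered strip is filled with fresh random dots. The image size, shift amount, and function name are illustrative choices.

```python
import random

def random_dot_stereogram(width=64, height=64, shift=4, seed=0):
    """Return (left, right) images as lists of rows of 0/1 dots.
    A central square is displaced by `shift` pixels in the right image."""
    rng = random.Random(seed)
    left = [[rng.randint(0, 1) for _ in range(width)] for _ in range(height)]
    right = [row[:] for row in left]          # start as an exact copy

    x0, x1 = width // 4, 3 * width // 4       # horizontal extent of the square
    y0, y1 = height // 4, 3 * height // 4     # vertical extent of the square
    for y in range(y0, y1):
        # Copy the square's dots into the right image, moved left by `shift`.
        for x in range(x0, x1):
            right[y][x - shift] = left[y][x]
        # The strip uncovered by the move gets new, unrelated random dots.
        for x in range(x1 - shift, x1):
            right[y][x] = rng.randint(0, 1)
    return left, right

left, right = random_dot_stereogram()
for row in left[:4]:                           # print a few rows as a quick look
    print("".join("#" if dot else "." for dot in row))
```

Neither image on its own contains any trace of the square; it exists only in the relationship between the two. When the pair is fused, a leftward shift in the right-eye image should make the square appear to float in front of the background.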

The random-dot stereogram puts to rest the idea that the visual system first 
extracts features from the individual images at the eyes, and then later matches those 
features in the brain. After all, there are no features in these drawings to be matched! 
The identification process must be somewhat more complex. It may be interesting 
to note that infants as young as four months, as well as monkeys, cats, and falcons, 
appear able to see the effect. A variant on the random-dot stereogram is the single-image
random-dot stereogram (SIRD). A SIRD is a repeating band of vertical texture,
where the dots have been displaced horizontally as a function of their depth. Some 
people can deliberately cross their eyes and line up adjacent copies of the bands, so 
that dots shifted in one band appear over unshifted dots in another band; the brain 




FIGURE 1.31

Random-dot stereograms. To see the stereogram, it may help to place a piece of paper between 
your eyes, so that each eye sees only one image. Try to illuminate both sides equally. Relax your 
focus and attempt to fuse the two images. You’ll find that a part of the image appears to float in 
front of the background. 
interprets these shifts as changes in depth, as in the regular random-dot stereogram,
and a depth pattern encoded into the shifts can be made to emerge.

The principal characteristic distinguishing the images in the two eyes is retinal 
disparity. This refers to the lateral separation of the two images: the fact that some
features are shifted on one retina with respect to other features. Although we have 
seen that features are not extracted and then matched, there must be some sort of 
matching process going on in the visual system, since we do perceive one complete, 
3D world, rather than two similar views at all times. 

The complete explanation of depth perception and binocular image combination 
is not known, but it appears that the physiology of the visual system and the brain 
plays a very large role in resolving retinal disparity to create a unified image of the 
world. There seem to be cells that are specifically designed to find matches between 
particular parts of each retina. When these cells find a match, the depth of the point 
of fixation may be used to help determine whether the object under scrutiny is closer 
or farther than the focus point. 

As with the rest of the visual system (and the entire human body), stereopsis is 
both remarkably robust and fragile. If any of the many steps involved in stereopsis 
are not satisfied, then a person is said to be stereoblind. Rather than tolerate two
competing or unresolved images, the visual system seems to select one image for 
presentation to the rest of the brain, and suppresses the information coming from 
the other eye. The choice of which eye’s image to process may be fixed, or may 
change, depending on the individual. 


1.5.3 Monocular Depth

Several depth cues can be extracted from a single image; these are known as monocular
depth cues. There are two general categories of such cues: static cues that can
be extracted from a single scene, and dynamic cues that require several images over 
a period of time. We will look at static cues first. 


Interposition 

The first cue we will examine is known in the vision community as interposition, and
in computer graphics as visibility. Computer graphics has a tradition of generating
this cue using hidden-surface removal techniques. The simplest of these techniques,
the painter’s algorithm, simply renders all the objects in the image one by one,
working from the farthest to the nearest, overwriting any previous information in the 
image. The interposition cue is how we understand such a scene: if object A occludes 
object B, we assume that A is nearer than B. Interposition is very powerful; if in an 
experiment a subject is shown a scene in which retinal disparity and interposition 
cues contradict each other, the interposition cues will win out.
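
The painter’s algorithm described above can be sketched in a few lines. This is a minimal illustration under simplifying assumptions: the objects are axis-aligned rectangles, each with a single depth, drawn into a small character grid. The tuple layout and function name are hypothetical.

```python
def painters_algorithm(rects, width, height, background="."):
    """Draw rectangles from farthest to nearest, overwriting earlier pixels.

    Each rectangle is (depth, x, y, w, h, label); a larger depth means
    farther from the viewer. Returns the image as a list of strings."""
    image = [[background] * width for _ in range(height)]
    # Sort so the farthest object is painted first and the nearest last.
    for depth, x, y, w, h, label in sorted(rects, key=lambda r: r[0], reverse=True):
        for row in range(max(y, 0), min(y + h, height)):
            for col in range(max(x, 0), min(x + w, width)):
                image[row][col] = label    # nearer objects overwrite farther ones
    return ["".join(row) for row in image]

scene = [
    (1.0, 2, 1, 4, 3, "A"),   # near rectangle
    (5.0, 0, 0, 5, 3, "B"),   # far rectangle, listed second on purpose
]
for line in painters_algorithm(scene, 8, 5):
    print(line)
```

Wherever the two rectangles overlap, only A remains visible, so a viewer reading the result applies the interposition cue and concludes that A is nearer than B. A single depth per object is a strong simplification; objects that overlap cyclically or interpenetrate cannot be ordered this way.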
FIGURE 1.32

An example of the size cue. 


Size

The cue called size summarizes our experience that larger objects seem closer than 
smaller ones. Even in an abstract set of objects, such as the squares in Figure 1.32, 
the largest object appears to be closer than the others. We also seem to have a notion 
of familiar size; if you see a friend’s face, you can quickly estimate how far away that
person is because you know roughly the actual size of his or her face. 

Size may be responsible for the famous moon illusion. For a person on Earth
looking directly at the moon without additional optical instruments, the moon may 
be considered to always have a fixed radius and a constant distance from Earth. 
Therefore the visual angle subtended by the moon is a constant, and we might 
imagine that the moon should always appear the same size. For thousands of years 
observers have reported that the moon appears bigger when it is near the horizon 
than when it is high in the sky [72]. This phenomenon seems to be common to 
all cultures and ages. A complete answer to the moon illusion is still elusive, but 
it probably depends on a number of perceptual cues being combined unconsciously 
to cause different estimates of the moon’s size in different surrounding situations. 
The heart of the problem is that when the moon is low to the ground and visible 
behind common objects, we interpret it as part of that scene and apply our normal 
experience of Earth-based vision to interpreting the distance of the moon. That is, 
we mistake the size of the moon when it is near the horizon because it appears in 
close proximity to many other, familiar objects. 
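
The constant visual angle mentioned above is easy to compute. The sketch below treats the moon as a sphere at a fixed distance, as the text does; the radius and distance are approximate mean values supplied here as assumptions, and the function name is hypothetical.

```python
import math

MOON_RADIUS_KM = 1737.4        # approximate mean radius (assumed value)
MOON_DISTANCE_KM = 384_400.0   # approximate mean Earth-moon distance (assumed value)

def visual_angle_deg(radius, distance):
    """Angle subtended at the eye by a sphere of the given radius whose
    center is at the given distance (same units for both)."""
    return math.degrees(2.0 * math.asin(radius / distance))

print(visual_angle_deg(MOON_RADIUS_KM, MOON_DISTANCE_KM))
```

The result is roughly half a degree, and nothing in the calculation depends on where the moon sits in the sky, which is why the change in apparent size has to be attributed to perception rather than geometry.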

The argument is based on the idea that when the moon is high in the sky, we have 
no reference points, and because our normal range of visibility is typically only a 
few kilometers or less, we unconsciously assume the moon is at about this distance, 
underestimating its actual distance. Since we know that things appear smaller as 
they get farther away, we then underestimate the size of the moon to make it agree 
FIGURE 1.33

The moon illusion. 


with our underestimate of its distance. When the moon is on the horizon, we can 
compare it to familiar objects such as buildings and trees, and we unconsciously 
revise our distance estimate so the moon is farther away. But the retinal image of 
the moon hasn’t changed size. So if the moon is farther away, but its image is not 
smaller, the moon itself must be larger, as in Figure 1.33.

This explanation is far from the last word on this long-standing illusion [72], but 
it suggests that some distance and depth cues may involve a sophisticated blend of 
experience and judgment. 


Perspective

The depth cues classified as perspective phenomena all deal with perceived changes 
of physical structures with distance. Perspective is a natural result of the small pupil 
that acts as the entry gate to our visual system. You could think of the pupil as a point 
through which all light must pass, creating a perspective projection. Perspective is
not the only way to project a 3D world onto a 2D surface, but it is the one with 
which we are most familiar in our daily lives. 

Perspective may be used to fool us deliberately. Across the United States there are 
some famous tourist attractions that advertise themselves as located on “gravitational 
anomalies” or “physical impossibilities” [30]. Generally, the visitor is taken on a 
tour through one or more buildings where balls appear to roll uphill, people become 









FIGURE 1.34

The Ames room: a forced-perspective illusion. 


shorter and taller as they walk from one door to the next, and trees seem to grow 
at an angle. These are almost always forced-perspective illusions, where the normal 
visual cues of perspective are amplified and distorted so that we are presented with 
a consistent visual argument that defies our previous experience. Perhaps the most 
famous example of such an illusion is the Ames room, shown in Figure 1.34. Many
science museums have an Ames room in which you can experiment; it is fascinating 
that even when you know exactly how the illusion is constructed and the principles 
on which it is based, the visual argument is still compelling. 

Linear perspective is the geometric variety of perspective that is most familiar in 
computer graphics. It is the phenomenon whereby objects appear to get smaller as 
they get farther away. The diminishing size of railroad track ties as they recede is 
the classic example of this effect. 
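
The railroad example can be made concrete with an idealized pinhole projection, in which a point at depth z is scaled by 1/z. This is a minimal sketch: the viewer looks down the +z axis from the origin, and the tie width, eye height, and focal length are arbitrary values assumed for illustration. The function names are hypothetical.

```python
def project(point, focal_length=1.0):
    """Perspective-project a 3D point (x, y, z) through a pinhole at the
    origin onto the image plane z = focal_length."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the viewer (z > 0)")
    return (focal_length * x / z, focal_length * y / z)

def projected_tie_width(z, half_width=0.75, eye_height=1.5, focal_length=1.0):
    """Image-plane width of a railroad tie lying on the ground at depth z."""
    left = project((-half_width, -eye_height, z), focal_length)
    right = project((half_width, -eye_height, z), focal_length)
    return right[0] - left[0]

# Identical ties at increasing distance project to ever smaller widths.
for z in (5.0, 10.0, 20.0, 40.0):
    print(z, projected_tie_width(z))
```

Each doubling of the distance halves the projected width, which is the shrinking we read as depth when we look down a set of tracks.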

Texture gradient perspective tells us about depth by the change in the size, color, 
and spacing of objects with distance. Figure 1.35 shows an abstract example of this 
type of perspective. Sharp discontinuities in the texture field can suggest edges and 
corners. 

Aerial perspective (or atmospheric perspective) accounts for the effects of intervening
media such as fog and smoke, which are more pronounced upon the image
of an object as that object recedes. As light from an object is scattered through the 
medium, it loses saturation and can be hue-shifted; contours and sharp edges are 
also diffused. Objects that are farther away are seen less clearly than those nearby. 
FIGURE 1.35

An example of texture gradient. 


1.5.4 Motion Parallax 

The last cue we will examine depends on motion. As we move our heads, the relative 
position of objects appears to move as well; this is called motion parallax.

The field of apparent motion is not uniform. To see this, fix your gaze at some 
point not too far away and then move your head to the right. Objects nearer than 
the fixation point will appear to move to the left; those farther away will appear to 
move to the right, as in Figure 1.36. 

In general, objects closer than the fixation point will move in the opposite direction
of your head motion and those farther than the fixation point will move in
the same direction as your head. In both cases, the apparent speed of the motion
increases with distance from the fixation point. You can confirm this easily by closing
one eye, holding up two fingers at different distances, fixating on one and then
moving your head.
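
The geometry of this finger experiment can be checked numerically. The sketch below is a simplified top-down model under stated assumptions: the fixation point and the object both start straight ahead of one eye, the head slides sideways without rotating, and the gaze stays locked on the fixation point. Angles are measured relative to the line of sight, with positive meaning to the right. The function name and the distances are hypothetical.

```python
import math

def apparent_shift_deg(object_distance, fixation_distance, head_shift):
    """Angular position of an object relative to the fixation point after
    the head moves sideways by head_shift (positive = to the right).
    Both points start straight ahead, so before the move the angle is 0.
    A positive result means the object now appears to the right of fixation."""
    to_fixation = math.atan2(-head_shift, fixation_distance)
    to_object = math.atan2(-head_shift, object_distance)
    return math.degrees(to_object - to_fixation)

fixation = 2.0   # meters (assumed)
move = 0.1       # head moves 0.1 m to the right (assumed)
for d in (0.5, 1.0, 4.0, 8.0):
    print(d, apparent_shift_deg(d, fixation, move))
```

Objects nearer than the fixation point come out with negative (leftward) shifts and farther objects with positive (rightward) shifts, and the magnitude grows as the object's distance departs from the fixation distance, in agreement with the description above.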

Motion is relative, and motion parallax will occur if your head is still but the 
object is moving. A simple but very effective demonstration uses a large tree. View 
the tree with one eye when the air is calm, around noon when there are few horizontal 
shadows and the trunk appears to be a flat shade of brown; the tree will appear flat. 
But when the wind picks up and the leaves move, suddenly the tree will acquire an 
easily perceived depth. 

These two types of parallax are sometimes distinguished with the terms head-motion
parallax and object-motion parallax.
FIGURE 1.36

The apparent visual flow in head-motion parallax.


Note that motion parallax and retinal disparity seem to present the same information
in two different ways. Why should we have two such sensitive means for
determining depth? One answer may come from the different times when these skills 
are useful. Motion parallax is useful when a predator is moving quickly, chasing 
after moving prey. When a predator is searching for prey, it may be useful to stay as 
still as possible; in this case retinal disparity would be very useful. 


1.6 Color Opponency

It is interesting to consider how color information is propagated from the photoreceptors
to the brain. Just as we saw the effect of the surround on lightness contrast
above, there is a phenomenon called color contrast that causes us to see colors in 
different ways, depending on their surround. 

This is not a new idea. Consider what Leonardo da Vinci [113] had to say about 
it around A.D. 1500: 

Of several colours, all equally white, that will look whitest which is against 
the darkest background. And black will look intense against the whitest
background.

And red will look most vivid against the yellowest background; and the same is 
the case with all colours when surrounded by their strongest contrasts. 
FIGURE 1.37

A schematic view of color opponency. 


This observation is described by a theory called color opponency. The model for 
this theory is sketched in Figure 1.37. 

The basic idea is that color information is transmitted from the eye to the brain 
along three nerve bundles, or channels. The information along each channel is not 
simply the values of the three retinal photoreceptors. Rather, each channel carries 
a sum or difference of the color information derived from the photoreceptors. The
sum of the responses from the M and L cones is transmitted along the achromatic 
channel A = M + L; this carries only black-and-white (or intensity) information. 

It has been suggested [472] that early, primitive sea creatures developed this 
achromatic channel first, as a basic intensity-only response to light. As animals 
became more complex, a chromatic channel developed, primarily to differentiate 
sky and water from earth and vegetation. A second channel then developed to 
provide a further refinement in the ability to distinguish colors. 

One chromatic channel carries the difference between the M and L cones. Since 
these correspond roughly to green and red, this is called the red-green chromatic 
channel. In symbols, R/G = M - L (note that the channel’s conventional name,
R/G, does not imply the ratio of R to G).

A second chromatic channel makes use of the S photoreceptors. This carries the 
difference between the S information (roughly in the blue region) and the achromatic 
channel (roughly yellow). So this second channel is called the blue-yellow chromatic 
channel: B/Y = S - A. 
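
The three channel definitions, A = M + L, R/G = M - L, and B/Y = S - A, translate directly into code. The sketch below simply evaluates those formulas; the cone response values are made-up numbers for illustration, and the function name is hypothetical.

```python
def opponent_channels(s, m, l):
    """Combine S, M, and L cone responses into the three opponent channels
    given in the text: achromatic, red-green, and blue-yellow."""
    achromatic = m + l               # A   = M + L
    red_green = m - l                # R/G = M - L
    blue_yellow = s - achromatic     # B/Y = S - A
    return achromatic, red_green, blue_yellow

# Made-up cone responses, for illustration only.
a, rg, by = opponent_channels(s=0.2, m=0.5, l=0.3)
print(a, rg, by)
```

Because red and green share one signed axis, a single value of R/G cannot lean toward both at once, which matches the observation that we never see a reddish green.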

This suggests that colors get transmitted in a 3D space with axes of intensity, 
red-green, and blue-yellow. This is why a color may be reddish yellow, but never 
reddish green. This is easily verified: if you project red light onto a white screen, 
and add green light, the sum appears yellow, not greenish red. 

This theory also suggests why some colors appear more saturated than others; for 
example, a yellow appears less saturated than a red or blue [389]. This comes about 
because a hue will appear desaturated if it creates a strong achromatic response, and 
at least some response in one of the chromatic channels: it’s the ratio of the chromatic 
to the achromatic response that predicts how saturated a color will appear. 

This is only a rough description of color contrast and color opponency, but it 
should suggest that our perception of color depends on many factors, such as the 
color of the surrounding environment, and that our visual system doesn’t allow us 
to perceive certain color combinations. 

These observations have important implications when we design and choose 
colors for image synthesis. The surrounding field of every color must be considered 
if we want to present a particular color. 

1.7 Perceptual Color Matching: CIE XYZ Space

As we have seen above, the response of the visual system to incident light depends 
on the different adaptations made by the physical components of the eye. In fact, 
that is barely the tip of the iceberg: the many additional layers of physical and 
psychological processing each have their own mechanisms for reacting to different 
forms of light input and image structure, and thereby affect the overall response of 
the visual system. 
Although a further study of the human visual system is fascinating and rewarding,
we will not need to explore this field deeper for this book. But keep in mind that 
there are many factors to be considered when evaluating how an observer will react 
to a particular visual input. Some of the additional problems include: the frequency 
distribution and intensity of the background illumination, the size of the target (or 
image), the intensity and frequency of recent stimuli, fatigue, age of the observer, 
and even nutrition. 

Given the complications, it may seem hopeless to attempt to find some single 
way to describe “color” in terms of human perceptual response. We may be able to 
create a laser that radiates at 555 nm, and call it “green,” but how do we determine
if an observer would call it “green”? And given the enormous range of influences on
the visual system and its response, might someone call this laser “green” today but
“red” tomorrow?

As we know from experience, the situation is not that bad. In practice, most people
have no trouble differentiating “red” from “green” on a reliable basis; achieving
this consistency is probably the purpose of many of the correction and adaptation 
mechanisms we have discussed. But a single objective standard would be a very 
useful context in which to discuss color. We could then discuss different observers 
with respect to how they differ from an objective, standard observer. 

A set of standard conditions for measuring human response to color was decided
upon by the CIE (Commission Internationale de l’Éclairage). Under these test
conditions, a number of color matching experiments were performed. 

One result of these experiments was the observation that any perceived color 
could be generated by some combination of three well-chosen light sources. This 
is almost certainly a result of the fact that our eyes contain three different types of 
cones, each sensitive in a different frequency range. 

Conceptually, the experiments proceeded as follows. Three particular light 
sources were chosen and projected on the left side of a white screen, so they overlapped
and their colors added together, as in Figure 1.38. Subjects were seated in
front of this screen, and given a knob to control the intensity of each of the three 
sources. Then on the right side of the screen a single “target” color was shown, and 
the subject was asked to adjust the knobs of the three sources until the mixed color 
matched the target color. The lights were arranged so that the intensity of each of 
the three source lights could be dialed to any number between +1 and -1. At +1 the 
light was fully on, at 0 it was fully off, and at -1 the color was “subtracted” from the 
composite; this was achieved by instead adding it to the target (this was necessary 
in order to match all colors). This matching experiment was run for every spectral 
color, and the three source values were recorded. The results of this experiment for 
one set of source lights, simply called r, g, and b, are shown in Figure 1.39. These
three lights were almost monochromatic, that is, almost completely made up of a
single pure wavelength; in this case r = 700 nm, g = 546.1 nm, and b = 435.8 nm
[489]. Although each person’s responses are different, after enough trials we can 
FIGURE 1.39

The r, g, and b color-matching curves. Redrawn from Wyszecki and Stiles, Color Science, fig. 4-, p. 124.






[Plot: horizontal axis is Wavelength, λ (nm), marked from 400 to 700.]

FIGURE 1.40

Two different spectra that appear the same. Redrawn from Wyszecki and Stiles, Color Science,
fig. 6, p. 126.


average the results and attribute them to a hypothetical standard observer. This has 
been a very simplified account of color matching; more details are available in many 
reference texts, such as Wyszecki and Stiles [489]. 

One surprising result of the color matching experiments is that very different 
spectra can evoke the same perceived color. Figure 1.40 shows two spectra, each 
of which causes observers to report the same perceived color. Different spectra
that give rise to the same perceived color under some set of conditions are called 
metamers. In fact, any perceived color may be matched by an infinite number of
different metamers. This has important implications for image synthesis: if we 
wish to represent the arbitrary color of one or more objects in a synthetic scene 
with a spectral energy distribution, we may choose from the infinite possibilities 
any metamer we like. Often this will be the one most convenient for storage and 
computation. 

Because of the practical difficulties in working with control values that are sometimes
negative, the CIE defined three new hypothetical light sources, with spectra






FIGURE 1.41

The $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$ color-matching functions for the 2° standard observer (solid curve) and
the 10° standard observer (dashed curve).


designated $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$. The matching curves for these functions across the
spectrum are shown in Figure 1.41; note that all values are always positive.

To predict how much of each source would be needed to match an arbitrary
input color $C(\lambda)$, you add together the necessary amount of each component at each
wavelength. Mathematically, this is simply the integral of the product of the input and the source.
In other words, to “match” the color $C(\lambda)$, we find how much of each of the three
standard sources we need to add together to create a color perceptually equivalent
to $C(\lambda)$. Thus to match $C(\lambda)$ using

$$C(\lambda) = X\,\bar{x}(\lambda) + Y\,\bar{y}(\lambda) + Z\,\bar{z}(\lambda) \tag{1.1}$$

we find the weights X, Y, and Z from

$$X = \int_{\lambda \in \mathcal{R}_V} C(\lambda)\,\bar{x}(\lambda)\,d\lambda$$

$$Y = \int_{\lambda \in \mathcal{R}_V} C(\lambda)\,\bar{y}(\lambda)\,d\lambda$$

$$Z = \int_{\lambda \in \mathcal{R}_V} C(\lambda)\,\bar{z}(\lambda)\,d\lambda$$

(recall that $\int_{\lambda \in \mathcal{R}_V}$ stands for $\int_{380}^{780}$ when the visual band is taken to be 380 to
780 nm).
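
In practice these three integrals are evaluated numerically. The sketch below uses the trapezoid rule over 380 to 780 nm and takes the spectrum and the three matching functions as ordinary Python functions of wavelength. The bell-shaped curves used in the demonstration are toy stand-ins invented for this example; they are not the CIE matching functions, which would normally be loaded from published tables. All function names are hypothetical.

```python
import math

def integrate(f, lo=380.0, hi=780.0, steps=400):
    """Trapezoid-rule estimate of the integral of f over [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def tristimulus(spectrum, xbar, ybar, zbar, lo=380.0, hi=780.0):
    """X, Y, Z weights: each is the integral of the spectrum multiplied by
    one matching function over the visual band."""
    X = integrate(lambda lam: spectrum(lam) * xbar(lam), lo, hi)
    Y = integrate(lambda lam: spectrum(lam) * ybar(lam), lo, hi)
    Z = integrate(lambda lam: spectrum(lam) * zbar(lam), lo, hi)
    return X, Y, Z

def bump(center, width):
    """Toy bell-shaped curve. A stand-in only, NOT real CIE data."""
    return lambda lam: math.exp(-0.5 * ((lam - center) / width) ** 2)

# Hypothetical matching functions and a flat (equal-energy) test spectrum.
toy_xbar, toy_ybar, toy_zbar = bump(600.0, 40.0), bump(550.0, 40.0), bump(450.0, 30.0)
X, Y, Z = tristimulus(lambda lam: 1.0, toy_xbar, toy_ybar, toy_zbar)
print(X, Y, Z)
```

With tabulated matching functions the same structure applies, except that the integral usually becomes a sum over the table's wavelength samples.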

In effect, this standard defines a 3D linear space of colors, with respect to a 
particular coordinate system called CIE XYZ space. This 3D color space is awkward 
to work with directly. It is common to project the space onto the plane X+Y+Z = 1. 
This results in a 2D space known as a chromaticity diagram. Figure 1.42 shows this
plane including the 3D XYZ locus for visible colors, demonstrating the projection of 
the solid onto the XY plane. The coordinates in this projected 2D plane are usually 
called x and y, derived from the 3D values by the relations

$$x = \frac{X}{X+Y+Z}, \qquad y = \frac{Y}{X+Y+Z}, \qquad z = \frac{Z}{X+Y+Z} = 1 - x - y \tag{1.2}$$
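
The projection onto the plane X + Y + Z = 1 is a simple normalization. The sketch below computes the chromaticity coordinates from a set of XYZ values; the function name and the sample values are hypothetical.

```python
def chromaticity(X, Y, Z):
    """Project a color in XYZ space onto the plane X + Y + Z = 1.
    Returns (x, y, z); z is redundant because z = 1 - x - y."""
    total = X + Y + Z
    if total == 0:
        raise ValueError("chromaticity is undefined when X + Y + Z = 0")
    return X / total, Y / total, Z / total

# Made-up XYZ values, for illustration only.
x, y, z = chromaticity(20.0, 30.0, 50.0)
print(x, y, z)

# Scaling all three values leaves the chromaticity unchanged.
print(chromaticity(2.0, 3.0, 5.0))
```

Because the result does not change when X, Y, and Z are scaled together, the projection discards overall intensity and keeps only the relative proportions, which is why two coordinates suffice for the diagram.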

The plane of Figure 1.42 is shown in Figure 1.43. The curve in Figure 1.42 is 
based on using targets that subtend a 2° angle from the observer; this is often called 
the xy triangle for the 2° standard observer. Since the three $\bar{x}(\lambda)$, $\bar{y}(\lambda)$, and $\bar{z}(\lambda)$
matching functions are all positive, all colors lie within the convex shape created by 
the horseshoe curve forming the top two legs of the “triangle.” We may then simply 
draw a line connecting the two ends of the horseshoe, and thereby define a closed 
convex shape which contains all colors. The shows the 

subset of colors that may be displayed on a typical CRT monitor. 

The standard observer is a useful myth. In practice, each person has a slightly 
different response to color, influenced by many environmental and psychological 
factors. 





FIGURE 1.43


A chromaticity diagram. Redrawn from Silverstein, Color and the Computer, fig. 2-4, p. 33.


Perception may also be altered deliberately. During World War II, the United 
States Navy wanted to use infrared signal lights to send signals that would be invisible 
to everyone but the intended recipient. Unfortunately, United States seamen were as 
blind to infrared as everyone else, so the signals were useless [389]. To overcome
this, Navy scientists observed that all retinal photopigments include vitamin A. They 
hypothesized that by feeding the sailors a diet containing a chemical form of vitamin 
A different from that in a normal diet, they might be able to influence the character 
of the photoreceptive cells. For several months volunteers were fed a diet low in the 
usual form of vitamin A but rich in an alternative chemical form. The experiment 
appeared to be working, but an electronic device capable of sensing infrared was 
The spectral locus.


developed at about the same time, and the Navy cancelled the dietary experiment. 
Nevertheless, the results show that a change in diet can influence the perception of 
color. 


1.8 Illusions 

Optical illusions have contributed a lot to our understanding of the visual system. 
They serve to isolate and demonstrate effects and phenomena that we usually either 
take for granted or are unaware of. The classic compendium of visual illusions is 
Luckiesh [277]. More recent catalogs of illusions may be found in the references in 
the Further Reading section. 

We present a few illusions here to illustrate that there can be a large disparity 
between the mechanistic description of an image and its perception. It can be 
easy to forget this when working in computer graphics; there is a temptation to 
believe that if one performs an accurate physical simulation of light physics, with 
appropriately stable numerical methods and signal processing, then the final result 
is an “accurate” image. The definition of accurate in this case does not include the 
FIGURE 1.44

Subjective contours. 


observer, since people often “see” objects, contours, and relationships in images that 
are not explicitly part of the image. 

For example, consider Figure 1.44. In these images most observers perceive a 
triangle whose corners are suggested by the cutaway black dots. The visual system 
fills in the rest of the contours of the triangle, even if the edges are not straight; these 
are called subjective contours . 

Many famous illusions place equal-size objects in different contexts, with the
result that they appear unequal. The Müller-Lyer illusion in Figure 1.45 shows two
horizontal lines of equal length, one bracketed by inward-pointing arrows, the other
by outward-pointing arrows. The line enclosed by inward-pointing arrows usually
appears longer. One explanation of this effect is that the arrows appear to suggest
two intersecting planes. The inward-pointing arrows suggest an angle that is concave
from our point of view; for example, we are looking into the junction between a wall
and a ceiling from inside a room. The outward-pointing arrows suggest a convex
angle; the outside of a box, for instance. Observers perhaps assume that the concave
intersection is farther away than the convex one, since it appears to be receding. Since
both horizontal segments have the same length in the image, we “know” from perspective
that the farther-away line must therefore be larger [389]. This argument is not certain,
but it suggests the type of high- and low-level phenomena that probably combine to
create some illusions.

FIGURE 1.45

The Müller-Lyer illusion. Both horizontal lines are the same size.

A similar illusion is shown in Figure 1.46; the two inner circles are the same size, 
though they usually don’t appear that way. The explanation for this illusion is even 
more tenuous. 

Humans tend not to be particularly good at estimating absolute quantities, particularly
the magnitudes of angles. In general, small angles tend to be overestimated and
large angles underestimated [358]. Professional magicians know that these errors in 
judgement can be enhanced by additional visual cues; thus, a magician’s assistant 
may “disappear” from a clear tank that is sitting on a base that is “obviously” too 
small for the assistant to have curled up into. Our perception of the size of the 
base is misguided by color and shape cues that are carefully designed to force us to 
underestimate its true size and shape. 

Other classic illusions include “impossible figures,” where we are presented with 
a planar projection of a 3D shape that is locally logical but globally inconsistent. 
Famous examples include Penrose’s impossible tribar and endless staircase, in Figure 1.47.

FIGURE 1.46

The two inner circles are the same size.

FIGURE 1.47

Two illusions found by Roger Penrose. (a) The impossible tribar. (b) The endlessly ascending (or
descending) staircase.


1.9 Further Reading

Much more on the human visual system may be found in standard textbooks and 
advanced research works. In particular, Sekuler’s textbook [389] is a good source 
for a general introduction, and Wandell’s textbook [463] provides more detail and
some basic mathematical structure. There is a lot of classic wisdom in Leonardo 
da Vinci’s notebooks; an excellent low-cost, unabridged, and illustrated two-volume 
translation is available from Dover [113]. Further information on the anatomy of the 
visual system may be found in volumes 2 and 4 of Davson [117,118]. An excellent 
brief survey of the visual system may be found in the IES lighting handbook [355]. 
Resnikoff presents a look at perception from an information-theory point of view 
[358]. Some numerical information on the various parts of the physical system is
given by Wyszecki and Stiles [489]. 

A discussion of impossible-figure illusions and a great variety of examples may 
be found in Ernst [137]. Discussions of illusions in general appear in Luckiesh 
[277], as well as Lanners [257], Gregory [171], and Sekuler [389]; the latter two 
contain modern descriptions of the theories that have been put forth to explain some 
illusions. Gregory in particular presents a very interesting discussion on the relation 
between perception and awareness. 

A description of the many processes involved in spatial vision may be found in 
DeValois and DeValois [123]. The visual systems of other animals are surveyed in 
detail in a lavishly illustrated volume by Sinclair [412]. A general discussion of the 
visual system and some philosophy about its relation to our development may be 
found in Gregory [171].

The dependence of the visual system on the direction of a grating was originally 
reported by Taylor [433]. The classic paper on the implications of photoreceptor 
packing patterns on the retina is Yellott’s paper of 1983 [496] on the monkey retina.
The work of Williams and Collier on the human retina [479] suggests that there may 
be similarities between the monkey and human retina. The particular types of noise 
that the human visual system is willing to tolerate were studied and characterized by 
Huang [218]. 


1.10 Exercises

Exercise 1.1

Fire trucks used to be painted red. Now many new fire trucks are yellow-green. 
Why? 

Exercise 1.2

If you open your eyes underwater, it is difficult to see well, even if you have normal
vision. But if you put on a face mask before diving, your vision is about as good as
it is out of water. Why?

FIGURE 1.48

CSFs for different species (falcon, macaque, owl monkey, cat, and goldfish). Redrawn from
DeValois and DeValois, Spatial Vision, fig. 5.2, p. 150.

Exercise 1.3

Consider the variety of contrast sensitivity functions shown in Figure 1.48. Suppose 
you were part of a psychology team preparing images to be shown to falcons. What 
are the implications for your rendering system? Considering the CSF as the only 
change between falcons and humans, could you produce images more quickly, or 
would it take more time? How about for a goldfish? 

Exercise 1.4

One common method for printing 3D figures is to print two different pictures on
top of each other, one each in red and green inks. Then a pair of glasses is supplied,
with a red transparent filter over one eye and a green filter over the other. Thus the
eye covered with the red filter perceives only the green part of the drawing, and the
other eye perceives the red. If the two figures are drawn as though seen from the
two eyes, a properly adapted viewer will see a 3D figure. Why do you think red and
green are the most common choice of colors for the inks? Why are the same colors
used for the filters? Would other colors work as well, or better?

FIGURE 1.49

An absorption curve that’s 0 in blue, about 1/3 in green, about 1 in red.

Exercise 1.5

What evolutionary factors do you think may have been involved that led to the eye
focusing at infinity when at rest?

Exercise 1.6

Design some images where different depth cues are inconsistent. Determine a relative 
ranking for the importance of different cues to the human visual system. 

Exercise 1.7

Colored glasses are popular both in mythology and in practice. 

(a) Would image synthesis be easier or faster if everyone wore rose-colored 
glasses? 

(b) Some firms sell sunglasses (sometimes called “blue-blockers”) that block ultraviolet
and even some visible-blue light. Would you expect these glasses
to actually improve any aspect of your vision in any specific and measurable 
ways? Explain. 

Exercise 1.8

Suppose you had a sheet of plastic with the response curve given in Figure 1.49. 
How would the daytime world look through a sheet of this material? How about 
through two sheets? Three? 







If the resolution of our vision were as poor as 
the resolution of our olfaction, when a bird flew 
overhead the sky would go all birdish for us for 
a while. 

Daniel C. Dennett 
(“Consciousness Explained,” 1991)



COLOR SPACES


2.1 Perceptually Uniform Color Spaces: L*u*v* and L*a*b*

The XYZ color space is not a very intuitive space. It is difficult to interpret the
meanings of the values for X and Z, though Y was designed to represent the brightness
of a color. In addition to an intuitive interpretation of the axes, an “ideal” color 
space would be perceptually linear: the distance between any two points measures 
how “alike” they look. Such a space can make some computations easier. 

For example, consider interpolation. If we wish to interpolate from color A to
color B, we might write $C = (1 - \alpha)A + \alpha B$ and sweep $\alpha$ from 0 to 1. This is the
typical way that Gouraud and Phong shading are implemented. We would probably
like equal increments of $\alpha$ to result in steps of C that were of perceptually equal sizes.
Unfortunately, this does not happen in XYZ space: equal steps along the path from A 
to B do not produce perceptually equal steps in the color of C. Figure 2.1 shows this 
phenomenon. It shows the results of a color-matching experiment. Conceptually, 
two colors of equal luminance were shown to observers and then one was changed. 
The observer was asked to report when the change was visible. Each ellipse is a 
region of constant color (the ellipses are magnified for visibility). The important 
observation here is that the ellipses are not the same size or in the same orientation.

FIGURE 2.1

The MacAdam ellipses. Redrawn from Pratt, Digital Image Processing, fig. 3.7-2(a), p. 86.


Thus, a particular magnitude of shift in color space at one point may be undetectable, 
but the same shift applied to a different color would be quite visible. 
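
The interpolation just described can be sketched in a few lines. This is only an illustration: the function names `lerp_color` and `sweep` and the sample colors are assumptions made here, and the sketch works on any three-component color regardless of which space the components belong to.

```python
def lerp_color(a, b, alpha):
    """Return (1 - alpha) * a + alpha * b, component by component."""
    return tuple((1.0 - alpha) * ca + alpha * cb for ca, cb in zip(a, b))


def sweep(a, b, steps):
    """Colors at equal increments of alpha from 0 to 1 (steps + 1 samples)."""
    return [lerp_color(a, b, i / steps) for i in range(steps + 1)]


if __name__ == "__main__":
    # Hypothetical endpoint colors; equal steps in alpha are equal steps
    # in the coordinates, not necessarily equal perceptual steps.
    for color in sweep((0.2, 0.3, 0.3), (0.8, 0.9, 0.7), 4):
        print(color)
```

The steps produced this way are equal in coordinate terms only; whether they look equal depends on the space in which the coordinates are expressed.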

To overcome this problem the CIE defined two new, alternative color spaces, 
called L*u*v* and L*a*b*. Both of these spaces, based on the XYZ space, were 
designed to be perceptually uniform. Figure 2.2 shows the result of the ellipses in 
the u*v* plane. Note that they are much more uniform than in Figure 2.1, though 
they are still not perfect. 

Another nonlinear transformation has been proposed [139] to make the color 
space even more uniform; the MacAdam ellipses in this space are shown in Figure 2.3. 
Though the uniformity is much better, the computation is much more complex than 
for the L*u*v* or L*a*b* systems, as discussed in Pratt [345]. 

Each space is defined with respect to a reference white color $(X_n, Y_n, Z_n)$. Usually
the reference white is one of the CIE standard illuminants, scaled so the $Y_n$ value is
100. Both spaces use the same definition of L*:

$$
L^* = \begin{cases}
116\,(Y/Y_n)^{1/3} - 16, & Y/Y_n > 0.008856 \\
903.3\,(Y/Y_n), & Y/Y_n \le 0.008856
\end{cases}
\tag{2.1}
$$

FIGURE 2.2

MacAdam’s ellipses in a perceptually linear space. Redrawn from Pratt, Digital Image Processing,
fig. 3.7-2(b), p. 86.

Note that L* = 100 for the reference white, when $Y = Y_n$. In fact, L* may be
considered to measure the “lightness” of the color. The conversion between XYZ
and L*u*v* is given by Wyszecki and Stiles [489]:


$$
\begin{aligned}
u^* &= 13\,L^*\,(u' - u'_n) \\
v^* &= 13\,L^*\,(v' - v'_n)
\end{aligned}
\tag{2.2}
$$

FIGURE 2.3

MacAdam’s ellipses in Farnsworth’s nonlinear transformation. Redrawn from Pratt, Digital Image
Processing, fig. 3.7-3, p. 87.


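The variables in Equation 2.2 are given by

$$
\begin{aligned}
u' &= \frac{4X}{X + 15Y + 3Z}, & v' &= \frac{9Y}{X + 15Y + 3Z}, \\
u'_n &= \frac{4X_n}{X_n + 15Y_n + 3Z_n}, & v'_n &= \frac{9Y_n}{X_n + 15Y_n + 3Z_n}
\end{aligned}
\tag{2.3}
$$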
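
Equations 2.1 through 2.3 chain together into a short conversion routine. The following is a minimal sketch under stated assumptions: the function names are invented here, the D65 white point values are the commonly quoted ones scaled so that Yn = 100, the constant 903.3 is the usual CIE value for the linear segment, and returning (0, 0) for u', v' when X + 15Y + 3Z = 0 is a choice made only to avoid a division by zero.

```python
def lightness(y, yn):
    """CIE L* as in Equation 2.1."""
    ratio = y / yn
    if ratio > 0.008856:
        return 116.0 * ratio ** (1.0 / 3.0) - 16.0
    return 903.3 * ratio


def uv_prime(x, y, z):
    """u', v' as in Equation 2.3; (0, 0) for black is an assumed convention."""
    denom = x + 15.0 * y + 3.0 * z
    if denom == 0.0:
        return 0.0, 0.0
    return 4.0 * x / denom, 9.0 * y / denom


def xyz_to_luv(xyz, white):
    """Convert an XYZ triple to (L*, u*, v*) relative to a reference white."""
    x, y, z = xyz
    xn, yn, zn = white
    l_star = lightness(y, yn)
    u_p, v_p = uv_prime(x, y, z)
    un_p, vn_p = uv_prime(xn, yn, zn)
    # Equation 2.2
    return l_star, 13.0 * l_star * (u_p - un_p), 13.0 * l_star * (v_p - vn_p)


# Assumed reference white: CIE D65, scaled so that Yn = 100.
D65 = (95.047, 100.0, 108.883)

if __name__ == "__main__":
    print(xyz_to_luv((50.0, 40.0, 30.0), D65))
```

Passing the reference white itself should give L* = 100 and u* = v* = 0, which is a convenient sanity check for any implementation.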


A plot of the spectral colors in L*u*v* space is shown in Figure 2.4. The solid in
the center is the region occupied by the colors reflected by objects that are illuminated
by the CIE standard illuminant $D_{65}$; it is the region within which the distance formula
in Equation 2.10 is intended to be valid [489].

FIGURE 2.4

Sketch of the L*u*v* color space. Redrawn from Wyszecki and Stiles, Color Science, fig. 1(3.3.9),
p. 166.

The L*a*b* space is another perceptually based color system that is sometimes
used instead of L*u*v*. The L*a*b* space is based on ANLAB(40), a color system
in wide use in the textile industry. The value for L* is the same as in Equation 2.1.
The other variables are given by

$$
\begin{aligned}
a^* &= 500\left[ f(X/X_n) - f(Y/Y_n) \right] \\
b^* &= 200\left[ f(Y/Y_n) - f(Z/Z_n) \right]
\end{aligned}
\tag{2.4}
$$


A plot of the spectral colors in L*a*b* space is shown in Figure 2.5. As in the
L*u*v* picture, the solid in the center is the region occupied by the colors reflected
by objects that are illuminated by the CIE standard illuminant $D_{65}$; it is the region
within which the distance formula in Equation 2.10 is intended to be valid [489].
Note that the spectral color curve has a kink at around 570 nm.

FIGURE 2.5

Sketch of the L*a*b* color space. Redrawn from Wyszecki and Stiles, Color Science, fig. 2(3.3.9),
p. 167.

Each of the ratios given above is passed through a function f before it is used.
The function usually takes the cube root of its input. For numerical precision and
stability, values below a certain threshold are approximated linearly:

$$
f(r) = \begin{cases}
r^{1/3}, & r > 0.008856 \\
7.787\,r + 16/116, & r \le 0.008856
\end{cases}
\tag{2.5}
$$


Just as L* corresponds to lightness (or the value transmitted along the achromatic 
channel in the visual system), the a* axis corresponds to the red-green channel and 
the b* axis to the blue-yellow channel. 
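
The conversion from XYZ to L*a*b* can be sketched as follows. This is an illustration rather than a reference implementation: the function names are invented here, and the coefficients 500 and 200 for a* and b* are the standard CIE 1976 values, which this sketch assumes are the ones intended by Equation 2.4.

```python
def f(r):
    """Cube root with the linear segment of Equation 2.5 near zero."""
    if r > 0.008856:
        return r ** (1.0 / 3.0)
    return 7.787 * r + 16.0 / 116.0


def xyz_to_lab(xyz, white):
    """Convert an XYZ triple to (L*, a*, b*) relative to a reference white."""
    x, y, z = xyz
    xn, yn, zn = white
    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    y_ratio = y / yn
    # L* as in Equation 2.1
    if y_ratio > 0.008856:
        l_star = 116.0 * fy - 16.0
    else:
        l_star = 903.3 * y_ratio
    a_star = 500.0 * (fx - fy)   # red-green channel (assumed CIE coefficient)
    b_star = 200.0 * (fy - fz)   # blue-yellow channel (assumed CIE coefficient)
    return l_star, a_star, b_star


if __name__ == "__main__":
    white = (95.047, 100.0, 108.883)  # assumed D65 white, Yn = 100
    print(xyz_to_lab((50.0, 40.0, 30.0), white))
```

Because a* and b* are differences of f values, any color with the same chromaticity as the reference white maps to a* = b* = 0.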

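To recover the XYZ coordinates of a color from either L*u*v* or L*a*b* requires
an inversion of the mapping process. The inverse relation for Y is the same in both
spaces. Inverting the cube-root branch of Equation 2.1 gives

$$
Y = Y_n \left( \frac{L^* + 16}{116} \right)^3
\tag{2.6}
$$

so that L* = 100 returns $Y = Y_n$, in agreement with Equation 2.1.

To recover X and Z from u* and v*, we first define a few temporary variables Q, R,
and A to help decouple the relations:

$$
Q = \frac{u^*}{13 L^*} + u'_n, \qquad R = \frac{v^*}{13 L^*} + v'_n, \qquad A = 3Y(5R - 3)
\tag{2.7}
$$

With these definitions, Q = u' and R = v', and solving Equation 2.3 for X and Z we find

$$
X = \frac{9QY}{4R}, \qquad Z = -\frac{1}{3}\left( X + \frac{A}{R} \right)
\tag{2.8}
$$

Note that if L* = 0, then X and Z are undefined. It is traditional in such cases to
set X = Z = 0.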
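
A sketch of the inverse mapping from L*u*v* back to XYZ is given below. The function name is invented here, and the closed forms used for X and Z are obtained by solving Equation 2.3 with Q = u' and R = v'; they should be read as one possible arrangement of the algebra, not necessarily the one printed in the original equation. The linear branch for small L* (below about 8) inverts the second case of Equation 2.1 with the assumed constant 903.3.

```python
def luv_to_xyz(luv, white):
    """Convert (L*, u*, v*) back to XYZ relative to a reference white."""
    l_star, u_star, v_star = luv
    xn, yn, zn = white
    if l_star == 0.0:
        # X and Z are undefined when L* = 0; follow the traditional convention.
        return 0.0, 0.0, 0.0
    # Invert Equation 2.1: cube-root branch above L* = 8, linear branch below.
    if l_star > 8.0:
        y = yn * ((l_star + 16.0) / 116.0) ** 3
    else:
        y = yn * l_star / 903.3
    denom_n = xn + 15.0 * yn + 3.0 * zn
    q = u_star / (13.0 * l_star) + 4.0 * xn / denom_n   # Q = u'
    r = v_star / (13.0 * l_star) + 9.0 * yn / denom_n   # R = v'
    a = 3.0 * y * (5.0 * r - 3.0)                        # A of Equation 2.7
    x = 9.0 * q * y / (4.0 * r)
    z = -(x + a / r) / 3.0
    return x, y, z


if __name__ == "__main__":
    white = (95.047, 100.0, 108.883)  # assumed D65 white, Yn = 100
    print(luv_to_xyz((100.0, 0.0, 0.0), white))
```

A round trip through the forward and inverse transforms is the simplest way to check both directions at once.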

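Recovery of X and Z from a* and b* is rather more direct. When all three ratios lie
above the 0.008856 threshold, so that f is the cube root,

$$
\begin{aligned}
X &= X_n \left[ \left( \frac{Y}{Y_n} \right)^{1/3} + \frac{a^*}{500} \right]^3 \\
Z &= Z_n \left[ \left( \frac{Y}{Y_n} \right)^{1/3} - \frac{b^*}{200} \right]^3
\end{aligned}
\tag{2.9}
$$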
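
The inverse from L*a*b* can be sketched in the same spirit. The names `f_inverse` and `lab_to_xyz` are invented here, the coefficients 500 and 200 are the standard CIE 1976 values (an assumption about the intended equation), and the sketch handles the linear segment of f as well as the cube-root branch.

```python
def f_inverse(t):
    """Inverse of the piecewise function f of Equation 2.5."""
    if t > 0.008856 ** (1.0 / 3.0):
        return t ** 3
    return (t - 16.0 / 116.0) / 7.787


def lab_to_xyz(lab, white):
    """Convert (L*, a*, b*) back to XYZ relative to a reference white."""
    l_star, a_star, b_star = lab
    xn, yn, zn = white
    fy = (l_star + 16.0) / 116.0     # f(Y/Yn), from the definition of L*
    fx = fy + a_star / 500.0         # f(X/Xn), assumed CIE coefficient
    fz = fy - b_star / 200.0         # f(Z/Zn), assumed CIE coefficient
    return xn * f_inverse(fx), yn * f_inverse(fy), zn * f_inverse(fz)


if __name__ == "__main__":
    white = (95.047, 100.0, 108.883)  # assumed D65 white, Yn = 100
    print(lab_to_xyz((50.0, 20.0, -10.0), white))
```

Unlike the L*u*v* case, nothing here divides by L*, so black (L* = a* = b* = 0) maps cleanly to X = Y = Z = 0.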





Neither of these two color spaces is perceptually completely uniform, though they 
are close. Work continues on developing more uniform spaces. The choice of which 
of these two spaces to use probably doesn’t matter as much as making sure one of 
them is used consistently. 

By design, the perceptual difference between any two colors A and B in either
perceptual color space may be computed as the Euclidean magnitude of the vector
between the colors:

$$
\begin{aligned}
\Delta E^*_{uv} &= \sqrt{(L^*_A - L^*_B)^2 + (u^*_A - u^*_B)^2 + (v^*_A - v^*_B)^2} \\
\Delta E^*_{ab} &= \sqrt{(L^*_A - L^*_B)^2 + (a^*_A - a^*_B)^2 + (b^*_A - b^*_B)^2}
\end{aligned}
\tag{2.10}
$$

One particularly important feature of these spaces is that two pairs of colors separated
by the same distance appear almost equally different.
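
The distance of Equation 2.10 is an ordinary Euclidean norm, so a sketch is very short. The function name `color_difference` is an assumption made here; the same routine serves both spaces as long as both colors are expressed in the same one.

```python
import math


def color_difference(color_a, color_b):
    """Euclidean distance between two colors given in the same uniform space."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(color_a, color_b)))


if __name__ == "__main__":
    # Two hypothetical colors as (L*, a*, b*) triples.
    print(color_difference((50.0, 0.0, 0.0), (50.0, 3.0, 4.0)))
```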

These spaces do admit an intuitive interpretation. Think of either space as a 
cylindrical coordinate system, with L* acting as the main axis of the cylinder, and 
the other coordinates representing a point in the plane perpendicular to this axis. 
The L* axis represents the “lightness” of a color. Given a value of L*, the plane 
through the color point perpendicular to the L* axis defines a 2D system based on 
(u*,v*) or (a*,b*). Intuitively, the angle around this plane represents the hue of the
color, and the distance from the L* axis represents the saturation. More formally, h,
the CIE 1976 hue-angle, is given by Hunt [219]:

$$
\begin{aligned}
h_{uv} &= \tan^{-1}(v^*/u^*) \\
h_{ab} &= \tan^{-1}(b^*/a^*)
\end{aligned}
\tag{2.11}
$$

and C*, the CIE 1976 chroma, is given by

$$
\begin{aligned}
C^*_{uv} &= \sqrt{(u^*)^2 + (v^*)^2} \\
C^*_{ab} &= \sqrt{(a^*)^2 + (b^*)^2}
\end{aligned}
\tag{2.12}
$$
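
The cylindrical reading of these spaces can be sketched as below. The function names are invented here. One design choice deserves a label: the sketch uses `atan2` rather than a plain arctangent of the ratio, so that the hue angle lands in the correct quadrant and the case of a zero first coordinate does not divide by zero; this is an assumption about how the formula is meant to be applied, and the chroma is taken as the distance from the L* axis.

```python
import math


def hue_angle(c1, c2):
    """Hue angle in degrees in [0, 360) for (u*, v*) or (a*, b*)."""
    return math.degrees(math.atan2(c2, c1)) % 360.0


def chroma(c1, c2):
    """Distance from the L* axis for (u*, v*) or (a*, b*)."""
    return math.hypot(c1, c2)


if __name__ == "__main__":
    a_star, b_star = 20.0, -10.0  # hypothetical coordinates
    print(hue_angle(a_star, b_star), chroma(a_star, b_star))
```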


Much of the color computation in computer graphics has historically been done 
in RGB space (discussed in more detail in the next chapter). For example, Gouraud 
shading blends the color at the vertices across the face of a polygon. Typically 
this blending is carried out by linearly interpolating the RGB coefficients of the 
colors separately. Although this technique was chosen for convenience and speed 
rather than theoretical accuracy, it usually seems to work acceptably well. This may 
be somewhat surprising when we recall two of the problems that the perceptually 
uniform spaces were designed to cure: equal steps in RGB space are not perceptually 
equal color steps, and we might pass through colors on the way from one point to 
another that intuitively don’t seem to be “in between” the two endpoints. 

Figure 2.6 (color plate) shows the equal-step linear interpolation of two colors in 
RGB space and the corresponding colors in XYZ and L*u*v* space. We show these 
points plotted in both RGB and L*u*v* spaces in Figure 2.7. 


2.2 Other Color Systems 

Many applications of computer graphics require the use of accurate color representations
of natural objects. A blade of grass, a piece of obsidian, and a tin can all
have specific reflectivities that may be carefully measured. The best way to describe 
these colors is probably with some form of spectral radiant power distribution (i.e., 
a complete spectrum). Each material is described by much more than simply a single 
color; we will consider more complete material descriptions in Chapter 15. 

There are also times when it is useful to create a new color: for example, when 
creating textures to apply to surfaces or images that depart from reality. It is useful 
to have access to convenient color representations for color design in such situations. 

As mentioned above, the XYZ system, though a useful reference, is not an intuitive 
space in which to design colors. 

The RGB (red-green-blue) color cube, shown in Figure 2.8(a), is not much better 
than the XYZ space for color calculations. It is difficult to find any particular color, 
and once located, it is difficult to adjust that color. Classic examples of both of these 
problems are to ask a user to find brown, and then once found, make a lighter shade 
of brown. 

The L*u*v* and L*a*b* spaces have an intuitive interpretation as a roughly
cylindrical color space. In effect, the L* axis controls the lightness of the color. Each
cross section is a polar coordinate system with the angle controlling hue and radius
controlling saturation. A user may be given control over each of the three values.
Formulas for converting between these spaces and XYZ are given above. Interactive
navigation through one of these spaces is not easy.

FIGURE 2.7

The interpolation of two colors in equal steps in RGB and L*u*v* color spaces. (a) In RGB space.
(b) In L*u*v* space.

FIGURE 2.8

Several different color spaces. Redrawn from Hall, Illumination and Color in Computer Generated
Imagery, fig. 3.1, p. 46.

The next four color spaces are very similar to each other and the perceptually 
uniform spaces just mentioned. Each has a lightness axis and represents saturation by 
distance from that axis and hue by angle around the axis. Each is defined with respect 
to the particular monitor’s RGB primaries. Thus, when communicating a color designed with
one of the following systems, you must specify the monitor’s phosphor chromaticities 
in order to interpret the coefficients. It is probably better to convert the RGB values 
to XYZ, since that provides a universal, device-independent representation. 

The HSV (hue, saturation, value) hexcone is shown in Figure 2.8(b). The central 
axis carries the gray values from black at the bottom to white at the top. The 
conversion between RGB and HSV is a short procedure [147,181]. 

The HSL (hue, saturation, lightness) double hexcone is shown in Figure 2.8(c). Its 
difference from the HSV hexcone is that the level of maximum available saturation
is at L = 0.5 rather than at V = 1.0. The HSL double cone in Figure 2.8(d) is similar
to the HSL double hexcone, except that the cross section is circular rather than 
hexagonal. The HSL cylinder in Figure 2.8(e) is like the HSL double cone, except 
that the complete radius is available at all points along the L axis. 
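
As a small illustration of why these hue-based systems are convenient for color design, the sketch below uses Python's standard `colorsys` module, which is an implementation choice made here and not the procedure of [147,181]. The sample color is a hypothetical brown; note that `colorsys` calls the HSL system "HLS" and returns its components in that order.

```python
import colorsys

rgb = (0.6, 0.4, 0.2)  # a hypothetical brown, components in [0, 1]

# Hexcone (HSV) coordinates of the color.
h, s, v = colorsys.rgb_to_hsv(*rgb)

# Double-hexcone (HSL) coordinates; colorsys returns them as (H, L, S).
h2, l, s2 = colorsys.rgb_to_hls(*rgb)

# "Make a lighter shade of brown": raise the value, keep hue and saturation.
lighter = colorsys.hsv_to_rgb(h, s, min(1.0, v + 0.2))

if __name__ == "__main__":
    print("HSV:", (h, s, v))
    print("HLS:", (h2, l, s2))
    print("lighter RGB:", lighter)
```

The same adjustment made directly in RGB would require changing all three coefficients in the right proportions, which is the difficulty described above for the RGB cube.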


2.3 Further Reading 

More information on color descriptions may be found in the standard text on color 
systems written by Wyszecki and Stiles [489]. An extensive table for converting 
among different color standards used in image processing and broadcast is presented 
in Pratt’s book on image processing [345]. Hall [181] has much to say about color 
systems and their effective use, and provides source code for converting between 
color systems. Foley et al. [147] summarizes some of these and presents algorithms 
for converting between color spaces. 


2.4 Exercises

Exercise 2.1

Build a color picking system using the L*a*b* or L*u*v* color space. How easy is it
to use? Compare it to an RGB system. 

Exercise 2.2

Mixing light is an additive color system (red + green + blue = white), rather than a
subtractive system. Why do you think this is so?

Exercise 2.3

(a) Interpolate the color $C_1$ = (.2, .3, .3) to $C_2$ = (.8, .9, .7) in RGB space in ten
equal steps. Convert the RGB value at each step to L*a*b*, and find the
distance between each successive pair of points in L*a*b* space.

(b) Convert $C_1$ and $C_2$ to L*a*b*, and interpolate them in ten equal steps in that
space. Convert each interpolated L*a*b* value to RGB, and find the distance 
between each successive pair of points in RGB space. 

(c) Discuss your results. Suggest two situations where RGB interpolation is 
appropriate, and two where it is not. 

Exercise 2.4

Many of the intuitive color systems in this chapter use a cylindrical or conical 
coordinate system. Design an intuitive model based on a spherical system. What 
does it mean to interpolate colors in your system? Can you come up with a good 
distance metric? Do you think a color system built on a toroidal coordinate system 
would be a good idea? Why or why not? Do any other geometries suggest themselves 
to you for color selection? 





The portrait had altered. ... That such a 
change should have taken place was incredible 
to him. And yet it was a fact. Was there some 
subtle affinity between the chemical atoms, that 
shaped themselves into form and colour on the 
canvas, and the soul that was within him? 
Could it be that what that soul thought, they
realized?—that what it dreamed, they made 
true? 

Oscar Wilde 

(“The Picture of Dorian Gray,” 1891)


3.1 Introduction

The principal display devices in use today are light-emitting and light-propagating.
The distinction resides in where the light comes from: either the display itself or
elsewhere. Light-emitting displays include CRT and LED displays. Light-propagating
displays include print media, transparencies (including slides), and LCD panels.
Each type of display has many variations, and new alternatives are being developed 
rapidly. In this section we will focus our attention on the CRT because it is one of 
the most common devices used for creating images. 

Each type of device also has many variations in the geometry of its component 
color elements. For our CRT discussion we will emphasize the triangular lattice of 
phosphors, though alternatives abound [298]. 


3.2 CRT Displays 

A typical color CRT (cathode-ray tube) is shown in schematic form in Figure 3.1. At
the neck of the tube are three electron guns, each of which emits a narrow stream of
electrons. Each stream passes through a pair of deflection coils that exert a magnetic
force on the electrons and bend the beam from side to side and up and down.

FIGURE 3.1

A schematic view of a CRT.

The inside of the front face of the tube is coated with a pattern of blobs of 
three different types of chemicals known as phosphors. A phosphor is a chemical
material that radiates light of a characteristic color when struck by electrons. Up 
to a material-determined limit, striking a phosphor with more electrons causes it to 
glow brighter. Many different types of phosphors that emit different colors have 
been discovered. Most CRTs use red, green, and blue phosphors, which can form 
a basis for a useful region of color space. We will have much more to say about 
phosphors in Chapter 14. 

The arrangement of the phosphors on the inside of the tube varies, but one 
common setup is to place one small dot of each phosphor on the vertex of an 
equilateral triangle, as shown in Figure 3.2 [298]. Phosphors are usually described 
not just by color but also by persistence: how long they continue to glow after
absorbing a burst of electrons. A long-persistence phosphor will glow for a longer 
period of time than a short-persistence phosphor. If you have seen television tubes 
that appear to leave a “streak” behind fast-moving objects, this is probably due to 
overly long phosphor persistence.

FIGURE 3.2

The arrangement of electron guns and phosphors. (Labeled parts: electron guns, shadow mask,
phosphors.)


Typically, each of the three guns is dedicated to creating an electron beam that only 
strikes phosphors of a single color. So although each gun is simply an electron emitter, 
they are often called the red, green, and blue guns, identifying the color of phosphor 
that the electron beam eventually strikes. Since the three beams are deflected in 
unison, they are sometimes referred to as “the beam,” the three components being 
distinguished only when necessary. 

To ensure that each beam strikes the correct phosphor in each triplet, a shadow 
mask is usually placed just behind the phosphors. The mask is an opaque screen 
that has holes only where the beam needs to pass through to reach a phosphor. The 
mask and the geometry of the beam angle serve to limit the beam to the intended 
phosphor. 

To create an image, the beam is deflected in unison to sweep the entire face of 
the tube. Starting at the upper left (viewed from outside of the front face), the beam 
is moved across the screen to the upper right. As the beam moves into position to 
strike a particular triplet, the video signal coming into the monitor input specifies 
the color to be displayed at that point as a linear combination of red, green, and 
blue intensities, matching the phosphors on the screen. The intensity of the electron 
beam at each gun is modulated to match the specified intensity, which in turn causes 
each of the three phosphors struck by the beam to glow with the specified intensity. 
Then the beam is moved to the next triplet, the correct color intensities are fed to 
the guns from the video signal, and the process repeats.

FIGURE 3.3

Beam spread illuminates several phosphors.


Although in this discussion we spoke of “the triplet” to which a beam is aimed, 
in fact the beam is typically much wider than a single phosphor triplet. The beam 
itself has a profile as shown in Figure 3.3, so the phosphors near the beam center will 
glow most brightly and those to the sides less so [298]. The granularity of the dot 
spacing on the shadow mask (called the pitch of the mask) is typically in the range 
of 0.2 to 0.6 mm. The shadow-mask pitch is usually not the limiting factor on CRT 
resolution; this is usually due to the electron optics or the bandwidth of the video 
signal. 

Many factors can cause the beam to stray from perfect alignment with the phosphors.
These include assembly variations, stray magnetic fields from the environment,
or the effect of heat inside the tube causing various parts to expand. To reduce
the required precision, some CRTs are designed so that the phosphor dot is larger
than the expected projection of the electron beam, as in Figure 3.4(a). The beam thus
has some tolerance for both horizontal and vertical movement, and the energy will
land on the desired phosphor. If the beam moves too far, then a nearby phosphor of
another color will be illuminated, reducing the precision of displayed edges.

FIGURE 3.4

The idea of the guard band. (a) Conventional CRT. (b) Black matrix CRT. Redrawn from Merrifield
in Color and the Computer, fig. 3-11, p. 72.

In some environments, there may be enough ambient light in the room where the
CRT is viewed that the image on the screen will appear faded and the colors less
pure. Solutions to this problem generally involve somehow increasing the perceived
contrast. This may be accomplished with directional viewing screens, or
angle-restrictive filters, which can take the form of a thin or thick honeycomb placed over
the front of the CRT. This blocks ambient light from the side from reflecting off
the face of the CRT, but it still propagates any ambient light that does impinge on the
screen. Another solution is to reduce the size of each phosphor and place it on some
light-absorbing material such as carbon black, as in Figure 3.4(b); typically, the mask
aperture is enlarged slightly at the same time. The beam now surrounds the phosphor
and again there is some margin for misplacement. Everywhere on the face of the
tube where there is no phosphor, the background material will absorb the ambient
illumination. The amount of light each phosphor can put out is reduced because of
its smaller size, but the gain in contrast is sometimes worth the trade-off [298].

Other contrast enhancement relies on the optical properties of various materials.
For example, phosphors may be impregnated with pigments that transmit light near
the phosphor’s emission range and absorb all other light, effectively absorbing the
ambient illumination. This of course reduces the light emitted by the phosphor,
since some of its energy is being absorbed inside its own material, but the relative
proportions of the materials may be adjusted over a wide range to achieve a desired
contrast [267]. Another approach to increasing contrast involves a neutral-density
filter. This is a filter placed over the front of the CRT that uniformly reduces the
energy of all wavelengths of light passing through it. This improves
contrast because the intensity of the ambient light is usually much lower than the
intensity of the emitted light from the CRT, so attenuating both largely eliminates the
ambient light while still leaving a usable fraction of the emitted light. The environment, filter
choice, and intended use of the display determine what fraction of attenuation is
called for. Finally, a selective filter may be placed over the screen. Figure 3.5 shows
the spectrum for didymium glass, along with the emission bands for some generic
red, green, and blue phosphors. Didymium glass passes these wavelengths better
than others, so it works to attenuate some background radiation.

In general, monochromatic CRTs are capable of sharper focus, thinner lines, and 
brighter output than color CRTs. This is because the shadow mask blocks most of 
the beam energy in a multispectral tube, so that its achievable luminance is about 10 
to 20% of that achievable from a high-output monochromatic CRT [298]. 


3.3 Display Spot Interaction 

There are many types of phosphor geometries used for CRTs. We will use as an 
example a triangular lattice of clusters, where each cluster contains one phosphor 
each of red, green, and blue, as in Figure 3.6. 


3.3.1 Display Spot Profile 

We further assume that each piece of phosphor may be modeled as a point-source of 
light, with a circularly symmetrical emission p that assumes a Gaussian form. That 
is, the intensity p at each point (x, y) on the screen (when the illumination is at the 
origin (0,0)) is given by 

    p(x, y) = e^{-(x^2 + y^2)} = e^{-r^2}    (3.1)


FIGURE 3.5 


The spectrum of didymium glass. Redrawn from Merrifield in Color and the Computer , fig. 3-15, 
p. 75. 



FIGURE 3.6 

A triangular phosphor geometry. 

It will be useful to define R as that radius where p(R) = 0.5. Then 

    p(R) = e^{-R^2} = 0.5    (3.2)

or, solving for R, 

    R = \sqrt{-\ln(0.5)}    (3.3)


3.3.2 Two-Spot Interaction 

We are interested in the sum of many display spots on the screen. Following the 
ideas in Castleman [77], we consider each spot p_i to have center C_i. Then for any 
point P = (x, y), we may write the cumulative intensity D(P) as 

    D(P) = \sum_i p_i(P - C_i)    (3.4)


Equation 3.4 requires finding the contribution of every dot on the screen. The 
Gaussian p(r) decreases monotonically with r, so we expect that at some cutoff 
radius r = r_c, we can consider the contribution from a given spot to be negligible. 
For any threshold τ = p(r_c), we find r_c = √(−ln τ). We will suppose that a 
contribution of 1% is small enough to be negligible. When the spot drops to an 
intensity of τ = 0.01, we find r_c ≈ 2.15. Thus if spots are closer together than about 
2.15 times the radius of the Gaussian, they will sum with each other. We will consider 
spots farther away than 2.15 times the radius from any point to have a negligible 
contribution at that point. Call the interspot distance d. As d increases, two adjacent 
spots interact less. 
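
As a quick numerical check of the half-intensity radius and the 1% cutoff, the following Python sketch evaluates the Gaussian profile of Equation 3.1. The function names (spot_profile, radius_for_level) are illustrative choices, not names from the text, and all distances are in the same units as r in Equation 3.1.

```python
import math


def spot_profile(r):
    """Gaussian spot intensity at distance r from the spot center (Equation 3.1)."""
    return math.exp(-r * r)


def radius_for_level(level):
    """Radius at which the profile has fallen to `level` (0 < level <= 1)."""
    return math.sqrt(-math.log(level))


R = radius_for_level(0.5)          # half-intensity radius (Equation 3.3)
r_cutoff = radius_for_level(0.01)  # radius where a spot contributes only 1%

print(f"R        = {R:.4f}")
print(f"r_cutoff = {r_cutoff:.3f}")
```

Any other threshold can be tried by passing a different level to radius_for_level; a stricter threshold simply pushes the cutoff radius outward.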

Figure 3.7 shows the value of D(d/2) for two fully on spots as d increases from 
0 to 4R; that is, we are looking at how much light arrives at the point midway 
between the centers of the two spots as they move apart. We would like our field of 
white to have value 1.0 everywhere, so we watch the sum of the spot contributions at 
this (arbitrarily chosen) point and find the distance where the two spots sum to 1; 
we get a value of D(d/2) = 1 at d = 2R, where each spot contributes p(R) = 0.5. 

This suggests that an interspot spacing of d = 2R may be the most desirable, 
since we would like a flat field of fully on spots to have the value 1 everywhere. 
Figure 3.8 shows the amplitude of the field from one spot center to the next at this 
spacing. The total intensity at each point x measured from one center is given by 
D(x) = p(x) + p(2R − x). At x = 0 and x = 2R, D(0) = D(2R) = 1.0625. The 
lowest value is at x = R, where D(R) = 1, as expected. So the response isn’t quite 
flat, though the variation is only 6.25% of the amplitude we would prefer. 
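
The two values quoted above can be reproduced directly from the Gaussian profile. This is a minimal sketch, assuming two fully on spots with centers at x = 0 and x = d on a line; the name two_spot_field is a hypothetical helper introduced only for this example.

```python
import math

R = math.sqrt(math.log(2.0))  # half-intensity radius: exp(-R**2) == 0.5


def two_spot_field(x, d):
    """Summed intensity at distance x from the first of two spots placed d apart."""
    return math.exp(-x * x) + math.exp(-((d - x) ** 2))


d = 2.0 * R
print(f"D(0) = {two_spot_field(0.0, d):.4f}")  # at a spot center
print(f"D(R) = {two_spot_field(R, d):.4f}")    # midway between the spots
```

Evaluating two_spot_field at more values of x between 0 and 2R gives a curve like the one described for Figure 3.8.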

This analysis has only considered the interaction of two spots. We will get a 
better idea of how spots interact if we consider the entire local neighborhood for 
several different patterns. We will do this now. 

FIGURE 3.7 

The field D(0.5) halfway between two spots versus their distance d. 

FIGURE 3.8 

The field D(r) between two spots for different r using d = 2R. 


3.3.3 Display Measurement 

In the next few sections, we will look at the contrast, C, of several different patterns. 
Contrast may be defined as 

    contrast = \frac{\max - \min}{\max + \min}    (3.5)

(other definitions include max/min and (max − min)/min). 

FIGURE 3.9 

Positions for α, β, and γ points. 
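
The three contrast definitions mentioned with Equation 3.5 are simple enough to state as code. The sketch below is only illustrative; the function names are assumptions, and the sample values are the two-spot extremes 1.0625 and 1.0 found earlier.

```python
def contrast(brightest, darkest):
    """Contrast as defined in Equation 3.5: (max - min) / (max + min)."""
    return (brightest - darkest) / (brightest + darkest)


def ratio_contrast(brightest, darkest):
    """Alternative definition: max / min (undefined when min is zero)."""
    return brightest / darkest


def relative_contrast(brightest, darkest):
    """Alternative definition: (max - min) / min (undefined when min is zero)."""
    return (brightest - darkest) / darkest


print(f"{contrast(1.0625, 1.0):.4f}")
print(f"{ratio_contrast(1.0625, 1.0):.4f}")
print(f"{relative_contrast(1.0625, 1.0):.4f}")
```

One practical reason to prefer Equation 3.5 is that it stays finite, with value 1, when the darkest point is completely black, whereas the other two definitions divide by zero.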


To determine the contrast for different spot patterns, we will examine the brightness 
of the field at three points: the center of a spot (which we call an α-type point), 
a point midway between two spots (a β-type point), and a point midway between 
three spots (a γ-type point), as in Figure 3.9. 

To find the amplitude at each of these points, we need to find all the spots that 
contribute some light to that point. Recall our cutoff above of 2.15 radii; thus, we 
need only concern ourselves with spots that have centers within this radius of the 
point being evaluated. 

The analysis is based on the geometry of the phosphor pattern. Figure 3.10 shows 
the geometry for an α-type point, positioned directly over a spot center. Working our 
way outward, we find that because of symmetry there are only four unique types of 
phosphor centers that contribute: the spot the test point is on (S), and those labeled 
A, B, and D in the figure. Since we have set our interspot spacing to d = 2R, the 
cutoff for contribution by a spot to this test point is 2.15d. The circle in the figure is 
drawn at r = 2.15d. We also include one layer of centers outside the circle to confirm 
that we have included all the appropriate centers. Table 3.1 gives the distances. 

Thus the brightness for spot α may be written by summing the Gaussian response 
from each spot (using Equation 3.1 evaluated at the correct distance). For each 
pattern, each contributing spot will have an associated weighting factor w(S_i) of 
either 0 (if cell S_i is off), or 1 (if it’s on). We can write the final intensity of α (that 
is, D(α)) as 


    D(\alpha) = w(S) + \sum_{i=1}^{6} w(A_i) p(d_A) + \sum_{i=1}^{6} w(B_i) p(d_B) + \sum_{i=1}^{6} w(D_i) p(d_D)

              = w(S) + \sum_{i=1}^{6} w(A_i) e^{-d} + \sum_{i=1}^{6} w(B_i) e^{-4d} + \sum_{i=1}^{6} w(D_i) e^{-3d}    (3.6)

FIGURE 3.10 

Geometry for centered spot. 


Spot type   Distance (units of d)    Significant? 
S           d_S = 0                  ✓ 
A           d_A = 1                  ✓ 
B           d_B = 2                  ✓ 
C           d_C = 3 
D           d_D = √3 ≈ 1.732         ✓ 
E           d_E = √7 ≈ 2.646 

TABLE 3.1 

Spot distances for α. 

We can carry out the same analysis for the other two patterns. The distances for 
β and γ, shown in Figures 3.11 and 3.12, are given in Tables 3.2 and 3.3. 

We use the same capital letters in all three patterns for convenience; because each 
analysis has a unique number of elements, there should be no confusion about which 
position is intended by which letter. 

The corresponding equations for β and γ spots are given by 

    D(\beta) = \sum_{i=1}^{2} w(A_i) p(d_A) + \sum_{i=1}^{2} w(B_i) p(d_B) + \sum_{i=1}^{2} w(D_i) p(d_D)
               + \sum_{i=1}^{4} w(E_i) p(d_E) + \sum_{i=1}^{4} w(F_i) p(d_F)

             = \sum_{i=1}^{2} w(A_i) e^{-d/4} + \sum_{i=1}^{2} w(B_i) e^{-9d/4} + \sum_{i=1}^{2} w(D_i) e^{-3d/4}
               + \sum_{i=1}^{4} w(E_i) e^{-7d/4} + \sum_{i=1}^{4} w(F_i) e^{-13d/4}    (3.7)


and 


    D(\gamma) = \sum_{i=1}^{3} w(A_i) p(d_A) + \sum_{i=1}^{6} w(B_i) p(d_B) + \sum_{i=1}^{3} w(C_i) p(d_C) + \sum_{i=1}^{6} w(F_i) p(d_F)

              = \sum_{i=1}^{3} w(A_i) e^{-d/3} + \sum_{i=1}^{6} w(B_i) e^{-7d/3} + \sum_{i=1}^{3} w(C_i) e^{-4d/3} + \sum_{i=1}^{6} w(F_i) e^{-13d/3}    (3.8)
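
The spot counts that appear as summation limits in Equations 3.6 through 3.8 can be checked by brute force. The sketch below assumes a triangular lattice with unit spacing, so every distance is in units of d, and uses the 2.15 cutoff from the text; the helper names and the particular coordinates chosen for the β and γ test points are illustrative assumptions.

```python
import math
from collections import Counter

ROW = math.sqrt(3.0) / 2.0  # row height of a triangular lattice with unit spacing


def lattice_sites(extent=6):
    """Spot centers of a unit-spacing triangular lattice around the origin."""
    return [(i + 0.5 * j, ROW * j)
            for i in range(-extent, extent + 1)
            for j in range(-extent, extent + 1)]


def shells(point, cutoff=2.15):
    """Map squared distance -> number of spot centers at that distance from point."""
    counts = Counter()
    for x, y in lattice_sites():
        dist2 = (x - point[0]) ** 2 + (y - point[1]) ** 2
        if math.sqrt(dist2) <= cutoff:
            counts[round(dist2, 6)] += 1
    return dict(sorted(counts.items()))


ALPHA = (0.0, 0.0)                 # on a spot center
BETA = (0.5, 0.0)                  # midway between two adjacent spots
GAMMA = (0.5, math.sqrt(3.0) / 6)  # centroid of three mutually adjacent spots

for name, point in (("alpha", ALPHA), ("beta", BETA), ("gamma", GAMMA)):
    for dist2, count in shells(point).items():
        print(f"{name}: {count} spots at distance {math.sqrt(dist2):.3f} d")
```

Raising the cutoff argument shows the next shell outward for each point type, which is the extra layer of centers the tables list as not significant.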


3.3.4 Pattern Description 

Each pattern of on-and-off phosphors may be described by a characteristic cell, or 
cluster, which is a small group of pixels that is simply translated across the screen 
to generate the pattern. Because we are interested in the contrast ratios for different 
patterns, we will only consider black (off) and white (fully on) pixels. Each pattern 
will be characterized by a value r, defined as the fraction of the pixels in the cell 
that are white: 

    r = \frac{\text{number of white pixels}}{\text{total number of pixels in the cell}}    (3.9)

We will consider values of r from 0 to 1. 

FIGURE 3.11 

Geometry for β-type points. 

FIGURE 3.12 

Geometry for γ-type points. 

Spot type   Distance (units of d)     Significant? 
A           d_A = 1/2 = 0.5           ✓ 
B           d_B = 3/2 = 1.5           ✓ 
C           d_C = 5/2 = 2.5 
D           d_D = √3/2 ≈ 0.866        ✓ 
E           d_E = √7/2 ≈ 1.323        ✓ 
F           d_F = √13/2 ≈ 1.803       ✓ 
G           d_G = √19/2 ≈ 2.179 
H           d_H = √21/2 ≈ 2.291 
J           d_J = √31/2 ≈ 2.784 
K           d_K = 3√3/2 ≈ 2.598 

TABLE 3.2 

Spot distances for β. 


Spot type   Distance (units of d)     Significant? 
A           d_A = √(1/3) ≈ 0.577      ✓ 
B           d_B = √(7/3) ≈ 1.528      ✓ 
C           d_C = √(4/3) ≈ 1.155      ✓ 
D           d_D = √(19/3) ≈ 2.517 
E           d_E = √(16/3) ≈ 2.309 
F           d_F = √(13/3) ≈ 2.082     ✓ 
G           d_G = √(28/3) ≈ 3.055 
H           d_H = √(25/3) ≈ 2.887 

TABLE 3.3 

Spot distances for γ. 

3.3.5 The Uniform Black Field (r = 0) 

The uniform black field is shown in Figure 3.13(a). In this trivial pattern, every point 
(x, y) has the same intensity of 0: D(α) = D(β) = D(γ) = 0.0. This is a perfect response 
for this pattern; all three of our samples accurately display the pattern intensity 0.0. 


3.3.6 Clusters of Four (r = .25) 

A fundamental cell of four pixels, with one white pixel, is shown in Figure 3.13(b); 
it has a density of r = 1/4 = .25. There are four pixels in the cell, so there are 
four different places to put an α-type test point. These four choices are shown in 
the left-hand column of Figure 3.14. Similarly, there are four places to put a β-type 
point (in the middle column) and four places for a γ-type point (right column). The 
weights for the various spots around each center are given in Table 3.4. 

Using the weights in Table 3.4 and Equation 3.6, we can write the intensity for 
each center: 

    D(\alpha_0) = 1 + 6e^{-4d}
    D(\alpha_1) = 2e^{-d} + 2e^{-3d}
    D(\alpha_2) = 2e^{-d} + 2e^{-3d}
    D(\alpha_3) = 2e^{-d} + 2e^{-3d}    (3.10)

The last three positions of α are equivalent, since they have similar neighborhoods. 
Ideally, D(α_0) should be one and the other three should be zero. 

We can make the same analysis for the β positions; there are again four of them. 
The weights are summarized in Table 3.5. Using the weights in Table 3.5 and 
Equation 3.7, we can write the intensity for each center: 

    D(\beta_0) = e^{-3d/4} + 2e^{-7d/4}
    D(\beta_1) = e^{-3d/4} + 2e^{-7d/4}
    D(\beta_2) = e^{-d/4} + e^{-9d/4} + 2e^{-13d/4}
    D(\beta_3) = e^{-d/4} + e^{-9d/4} + 2e^{-13d/4}    (3.11)

Again we notice that, due to symmetry, the expressions for β_0 and β_1 are equal, and 
so are those for β_2 and β_3. 

Finally, we can carry out the same analysis for the γ class of points. The weights 
are summarized in Table 3.6. With Equation 3.8 and Table 3.6 we can write the 
values for each position of the γ points: 



FIGURE 3.13 

Contrast and average test patterns. (a) The uniform black field, r = 0.0. (b) Clusters of four, 
r = 1/4 = 0.25. (c) Clusters of two, r = 1/2 = 0.5. (d) The uniform white field, r = 1.0. 

FIGURE 3.14 

Centers for the clusters of four. α-type points are in the left column, β-type points are in the middle 
column, and γ-type points are in the right column. 

TABLE 3.4 

α weights for each cluster of four. 

TABLE 3.5 

β weights for each cluster of four. 

TABLE 3.6 

γ weights for each cluster of four. 

    D(\gamma_0) = e^{-d/3} + 2e^{-7d/3} + 2e^{-13d/3}
    D(\gamma_1) = e^{-d/3} + 2e^{-7d/3} + 2e^{-13d/3}
    D(\gamma_2) = 3e^{-4d/3}
    D(\gamma_3) = e^{-d/3} + 2e^{-7d/3} + 2e^{-13d/3}    (3.12)

We can now compute various intensities on the screen for this test image for 
different values of d. Figure 3.15 shows the values for the four different α-type 
points as d varies from R to 3R. Similarly, Figures 3.16 and 3.17 show the variation 
for β and γ points for the same range of d. 

We can now compute the contrast on the screen with respect to this set of positions. 
For each value of d, the brightest and darkest points are the maximum and 
minimum of Figures 3.15, 3.16, and 3.17. These are shown in Figure 3.18, along 
with the contrast value computed from them. 

Notice that the contrast improves as the spot spacing increases. This argues that 
for the best contrast, the spot centers should be as far apart as possible. 

Our calculations allow us to compute some other interesting properties as well. 
Consider the average intensity of the image. The ideal would be a value of 25%. 
If we average the four values for just the α positions, we find a range of averages 
shown in Figure 3.19. 
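
The trend of contrast against spacing can also be explored by summing Equation 3.4 directly over the lattice instead of using the closed forms. The sketch below rests on several assumptions that are not stated in the text: the white spot of each cell is taken to be the lattice site whose two indices are both even (which gives a white spot six white neighbors at distance 2d), the profile of Equation 3.1 is applied to true distances, and the α, β, and γ sample points are generated from fractional lattice indices. Because of these choices its numbers need not match the plotted figures; it is meant only to show the method.

```python
import math

R = math.sqrt(math.log(2.0))  # half-intensity radius of exp(-r**2)
ROW = math.sqrt(3.0) / 2.0    # row height of a unit triangular lattice


def position(i, j, d):
    """Screen position of lattice index (i, j); fractional indices are allowed."""
    return (d * (i + 0.5 * j), d * ROW * j)


def is_white(i, j):
    """Assumed cluster-of-four layout: a spot is on when both indices are even."""
    return i % 2 == 0 and j % 2 == 0


def field(point, d, extent=8):
    """Equation 3.4: sum the Gaussian profile of every white spot at `point`."""
    px, py = point
    total = 0.0
    for i in range(-extent, extent + 1):
        for j in range(-extent, extent + 1):
            if is_white(i, j):
                cx, cy = position(i, j, d)
                total += math.exp(-((px - cx) ** 2 + (py - cy) ** 2))
    return total


def sample_points(d):
    """Alpha, beta, and gamma points belonging to the four spots of one cell."""
    offsets = [(0.0, 0.0),                               # alpha: spot center
               (0.5, 0.0), (0.0, 0.5), (-0.5, 0.5),      # beta: edge midpoints
               (1.0 / 3.0, 1.0 / 3.0), (2.0 / 3.0, 2.0 / 3.0)]  # gamma: centroids
    return [position(i + di, j + dj, d)
            for i in range(2) for j in range(2) for di, dj in offsets]


def contrast_for_spacing(d):
    """Contrast of Equation 3.5 over the sampled alpha, beta, and gamma points."""
    values = [field(p, d) for p in sample_points(d)]
    return (max(values) - min(values)) / (max(values) + min(values))


for multiple in (1.0, 2.0, 3.0):
    c = contrast_for_spacing(multiple * R)
    print(f"d = {multiple:.0f}R  contrast = {c:.3f}")
```

Swapping is_white for a different rule (for example, one that turns on alternate rows) gives the same kind of estimate for other patterns.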


3.3.7 Clusters of Two (r = .5) 

A striped pattern may be generated by a fundamental cell of two pixels, with one 
white pixel. One example is shown in Figure 3.13(c); it has a density of r = 1/2 = .5. 

We will now apply the same analysis as above to this pattern. Figure 3.20 shows 
the cell neighborhoods. There are only two of each type of point in this size cluster. 
The weights for the two α-type spots are given in Table 3.7. 

Using the weights in Table 3.7 and Equation 3.6 we can write the intensity for 
each center: 

    D(\alpha_0) = 1 + 2e^{-d} + 6e^{-4d} + 2e^{-3d}
    D(\alpha_1) = 4e^{-d} + 4e^{-3d}    (3.13)

The weights for β-type points are summarized in Table 3.8. 

Using the weights in Table 3.8 and Equation 3.7 we can write the intensity for 
each center: 

    D(\beta_0) = e^{-d/4} + e^{-9d/4} + e^{-3d/4} + 2e^{-7d/4} + 2e^{-13d/4}
    D(\beta_1) = e^{-d/4} + e^{-9d/4} + e^{-3d/4} + 2e^{-7d/4} + 2e^{-13d/4}    (3.14)




FIGURE 3.17 

The intensity of γ points with respect to changing d for the cluster of four. (a) D(γ_0), D(γ_1), and 
D(γ_3). (b) D(γ_2). 

FIGURE 3.18 

Min, max, and contrast for the cluster of four for different values of d. 

FIGURE 3.19 

The average intensity of the cluster of four for different values of d. 

FIGURE 3.20 

Centers for the clusters of two. 


Again we notice that, due to symmetry, both expressions are equivalent. 

Finally, we can carry out the same analysis for the 7 class of points. The weights 
are summarized in Table 3.9. 

With Equation 3.8 and Table 3.9 we can write the values for each position of the 
γ points: 

    D(\gamma_0) = e^{-d/3} + 2e^{-7d/3} + 3e^{-4d/3} + 2e^{-13d/3}
    D(\gamma_1) = 2e^{-d/3} + 4e^{-7d/3} + 4e^{-13d/3}    (3.15)

The corresponding intensity plots for this pattern are shown in Figure 3.21 for α-type 
points and Figures 3.22 and 3.23 for β and γ points. 

As before, we can now compute the contrast on the screen with respect to this set 
of positions. For each value of d, the brightest and darkest points are the maximum 
and minimum of Figures 3.21, 3.22, and 3.23. These are shown in Figure 3.24, 
along with the contrast value computed from them. 

Again, notice that the contrast improves as the spot spacing increases. 

The ideal average intensity of this image would be a value of 50%. If we average 





FIGURE 3.21 

The intensity of α points with respect to changing d for the cluster of two. 

FIGURE 3.22 

The intensity of β points with respect to changing d for the cluster of two. 

FIGURE 3.23 

The intensity of γ points with respect to changing d for the cluster of two. 

TABLE 3.7 

α weights for cluster two. 

TABLE 3.8 

β weights for cluster two. 

TABLE 3.9 

γ weights for cluster two. 


the two values for just the α positions, we find a range of averages as shown in 
Figure 3.25. 


3.3.8 The Uniform White Field (r = 1) 

The uniform white field is shown in Figure 3.13(d). In this field every pixel is on. 
There is only one position for each type of point. We can write the equations for the 













FIGURE 3.24 

Min, max, and contrast for the cluster of two for different values of d. 

FIGURE 3.25 

The average intensity of the cluster of two for different values of d. 


different spot positions by inspection. 


    D(\alpha) = 1 + 6e^{-d} + 6e^{-4d} + 6e^{-3d}
    D(\beta) = 2e^{-d/4} + 2e^{-9d/4} + 2e^{-3d/4} + 4e^{-7d/4} + 4e^{-13d/4}
    D(\gamma) = 3e^{-d/3} + 6e^{-7d/3} + 3e^{-4d/3} + 6e^{-13d/3}    (3.16)

FIGURE 3.26 

Min, max, and contrast for the white field for different values of d. 

FIGURE 3.27 

The average intensity of the white field for different values of d. 


Contrast and average curves are shown in Figures 3.26 and 3.27. Notice that in 
this case we want the contrast to be as low as possible, since we want a flat, white 
field where every point in the image is identical. 










3.3.9 Spot Interaction Discussion 

The essential point to notice from the above discussion is that there is a natural 
tension on the distance between dots. To achieve a flat, uniform white image the 
dots should be close together, so there is very little black space between dots. But 
for high contrast, the dots should be far apart, so that one does not bleed into the 
next, and a spot that should be off has an intensity near zero. 

This tension must be balanced by the designer of the display; different choices 
may be appropriate for different images. Although hardware manufacturers set the 
dot spacing for CRTs, many tools for printing allow the user to set dot spacing and 
other parameters as part of the imaging process. 

The interdot spacing sets an upper limit on the precision with which we can 
represent detail in an image. The apparent dot spacing is a function of the physical 
spacing on the device and the distance between the device and the viewer, discussed 
in more detail below. 


3.4 Monitors 

Earlier we presented a description of the physical construction of a CRT. We now 
enlarge our view to include the driving electronics that control the beams; the composite 
device is called a monitor. 

Recall that the beam is swept top to bottom, left to right. When the beam 
reaches the upper-right corner of the screen at the end of the first row, or scan 
line, it moves back to the left side of the screen to start the next horizontal sweep. 
During this interval, called the horizontal retrace, the beam is blanked: the electron 
emission is set to zero, so no phosphors are affected. This interval is needed so 
that the deflection circuitry has time to update its charge, so the beam will 
be appropriately positioned when it is turned on again. When the deflectors have 
settled to aim the beam at the far left, it is turned on and the sweeping from left to 
right starts again. 

If the monitor is noninterlaced, then the second scan line swept out is directly 
beneath the first. Thus if the monitor displays pictures with a vertical resolution 
of 525 lines, then the order of lines swept out is 1, 2, 3, ..., 525, as shown in 
Figure 3.28(a). When the beam reaches the bottom right, it is again blanked and 
then moved back to the upper left, during the vertical retrace. In the United States, 
a complete video image is usually swept out in about 1/30 second. 

On the other hand, if the monitor is interlaced, then the image is built up by 
first displaying all of the odd scan lines, then all the even, so the order of lines 
would be 1, 3, 5, ..., 525, 2, 4, 6, ..., 524, 1, as shown in Figure 3.28(b). 
This requires an additional vertical retrace for each picture after the final odd scan 
line. The first set of lines is called the odd field, the second the even field. Most 

FIGURE 3.28 

Two raster scan patterns. (a) Noninterlaced. (b) Interlaced. 


commercial monitors in the United States display images in interlaced format, since 
commercial broadcasts use the NTSC standard, which specifies an interlaced signal. 
Some industrial monitors offer both modes, with the choice controlled by a switch 
or a computer signal. 
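
The two scan orders are easy to generate programmatically. This is a small illustrative sketch; the function names are assumptions, and scan lines are numbered from 1 as in the text.

```python
def noninterlaced_order(line_count):
    """Scan lines in the order a noninterlaced monitor sweeps them."""
    return list(range(1, line_count + 1))


def interlaced_order(line_count):
    """Odd field first (1, 3, 5, ...), then even field (2, 4, 6, ...)."""
    odd_field = list(range(1, line_count + 1, 2))
    even_field = list(range(2, line_count + 1, 2))
    return odd_field + even_field


print(noninterlaced_order(7))
print(interlaced_order(7))
```

For a 525-line picture, interlaced_order(525) yields a 263-line odd field followed by a 262-line even field.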

An advantage of interlaced display is that the likelihood of flicker is reduced. As 
we saw in Chapter 1, a rule of thumb is that under most viewing circumstances, 
the flicker rate is about 30 frames per second, so a noninterlaced monitor will often 
appear to flicker (consider a scan line 1/3 of the way down the screen; it is only 
refreshed every 1/30 second). Of course, an interlaced monitor also displays only 30 
complete frames per second, but alternating the fields effectively doubles the display 
rate. To see this, again consider a scan line 1/3 down the screen: although after it is 
swept it will not be swept again for 1/30 second, the scan lines immediately above 
and below it are drawn 1/60 second later. Since we saw earlier that phosphors are 
typically close enough to affect each other, we do not see a set of black bands where 
the scan lines are at their oldest (almost 1/30 of a second). The persistence of the 
phosphor also helps sustain the steady emission of light from that scan line until it 
is revisited. 

Most monitors provide a pair of controls called brightness and contrast for the 
user to adjust. Figure 3.29 shows a diagram of how these controls affect the signal 
driving the electron guns. Figure 3.30 is a curve showing the intensity of emitted 
light plotted against the voltage applied to the guns. Note that below a cutoff voltage 









FIGURE 3.30 


The phosphor light emission plotted as a function of input voltage. 


V_co, no light is emitted by the tube. For voltages V greater than the cutoff, the light 
intensity follows a power-law curve of the rough form k(V − V_co)^γ. The exponent 
γ is usually between 2 and 3; we will have more to say about it in a moment. 

The contrast control adjusts the amount of amplification of the signal; the more 
the signal is amplified, the brighter it will appear on the screen. The brightness 
control is a bias adjustment, adding some fixed voltage into the video signal before it 
reaches the guns. Note that this moves the response curve left and right, not up and 
down; this is important. The brightness setting then simply changes the minimum 
amount of video signal necessary to cause the screen to emit light. The brightness 
control is normally set to −V_co, so that 0 volts of signal is black, but any positive 
signal is visible. It is the control normally called “contrast” that adjusts the overall 
brightness of the image by boosting the intensity of the visible parts of the signal. 

Typically we desire a monitor’s response to be linear with input signal: if we 
double the signal, we would like to double the energy of the emitted light (notice 
that we are discussing radiant energy, and not brightness perceived by a human 
observer; the latter has its own nonlinearities). For example, if we compute one 
pixel to have a gray value of 100, and another a gray value of 200, we would expect 
the latter pixel to emit twice as much light as the former. But as we saw a moment 
ago, the intensity response curve is a power law, rather than linear, so doubling the 
input intensity does not double the output light. Symbolically (and assuming that 
brightness is set to −V_co), 2kV^γ ≠ k(2V)^γ. To compensate, we typically adjust 
the input signal before sending it to the monitor. Since the nonlinear response of 
the monitor is described by the exponent γ (gamma) in the response curve, this 
compensation is usually called gamma correction. Thus instead of sending V to 
the monitor, we send V^{1/γ}; then 2k(V^{1/γ})^γ = k((2V)^{1/γ})^γ. In broadcast video, 
gamma correction is performed before the signal is transmitted, so most monitors 
expect the signal to already be corrected. 
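
The effect of gamma correction on the doubling example can be seen in a few lines of Python. This is a minimal sketch, assuming signals normalized to the range 0 to 1 and measured from the cutoff voltage, a gamma of 2.8, and k = 1; the function names are illustrative.

```python
def emitted_light(signal, gamma=2.8, k=1.0):
    """Monitor response k * V**gamma for a signal measured from the cutoff."""
    return k * signal ** gamma


def gamma_correct(value, gamma=2.8):
    """Precompensate a linear value so the monitor's response cancels out."""
    return value ** (1.0 / gamma)


dim, bright = 0.2, 0.4  # the second value is twice the first

uncorrected_ratio = emitted_light(bright) / emitted_light(dim)
corrected_ratio = (emitted_light(gamma_correct(bright))
                   / emitted_light(gamma_correct(dim)))

print(f"uncorrected: {uncorrected_ratio:.2f}x the light")
print(f"corrected:   {corrected_ratio:.2f}x the light")
```

Without correction, doubling the signal multiplies the light by 2 raised to the power gamma; with correction the ratio returns to 2.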

The usual range of gamma for color receivers is 2.8 ± 0.3. Typically for video 
display, full gamma correction is not applied; instead, the video signal V is usually 
raised to about 1/2.2. The result is that the final image has an increase in gamma 
over the original input signal by a factor of about 1.27. This intentional error 
is introduced to compensate for the reduction in apparent contrast caused by the 
dim surround conditions in which a monitor is normally viewed. Unfortunately, 
this also causes colors to increase in purity and shift in their dominant wavelength. 
The increase in purity may be beneficial in some circumstances, when it serves to 
compensate for an apparent decrease in saturation of the colors due to the dim 
surround conditions. Unfortunately, the shift in dominant wavelength will cause 
small shifts in hue, as shown in Figure 3.31. 
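
The factor of about 1.27 follows from composing the two power laws. As a worked step, assuming a receiver gamma of exactly 2.8 and a correction exponent of exactly 1/2.2:

```latex
I \;=\; k\left(V^{1/2.2}\right)^{2.8} \;=\; k\,V^{2.8/2.2} \;\approx\; k\,V^{1.27}
```

So the displayed intensity is not linear in V but follows a residual power law with exponent near 1.27.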

In modern display systems the gamma is often fine-tuned by setting a compensation 
curve into the color map [78], though this must be done carefully [181]. 
Alternatively, the pixel values themselves may be precorrected [50]. 

The colors displayed on a monitor can be affected by many different phenomena, 
only some of which can be controlled [62,78]. Even the magnetic field of the Earth 
can affect the focusing and deflection of the electron guns [146], to the extent that 
moving a monitor from the Northern to the Southern Hemisphere can cause a shift of 
as much as 3 mm in the display’s center. Several manufacturers align their monitors 
in different magnetic field environments depending on the destination of the CRT. 
Even rotation within the field can cause a change in deflection; one company always 
calibrates its monitors while facing east [146]. 


3.5 RGB Color Space 

In Chapter 2 I stated that red, green, and blue (RGB) are often used in computer 
graphics as the basis of a color space. This is an important observation that has 
many practical results. The most obvious result here is that we can create any color 

FIGURE 3.31 

The shift in x,y coordinates due to gamma correction. Redrawn from Hunt, Reproduction of 
Color, fig. 19.10, p. 395. 

by an appropriate combination of red, green, and blue. This is why those particular 
types of phosphors are used in the construction of CRTs. 

But “red” is an imprecise specification of a color. Precisely what “red” is used 
in a CRT? The choice of phosphors must be carefully considered by a monitor 
builder. For example, people are very sensitive to skin colors. If the broadcaster 
specifies a particular color at some point on the screen to represent a skin tone, 
this cannot just be specified by some combination of “red, green, and blue”; one 
manufacturer’s choices of which colored phosphors to use may differ significantly 
from another manufacturer’s, resulting in very different final colors on the screen. In 
practice, most phosphor sets used in CRTs today are similar but not identical, and 

TABLE 3.10 

x,y coordinates for standard CIE phosphors and D6500 white point. Source: Data from Hall, 
Illumination and Color in Computer Generated Imagery. 


they can vary even within sets of the same model produced with different shipments 
of phosphors. 

shows the subset of colors 

that can be represented by a monitor with a particular set of phosphors. The 
phosphors are at the triangle vertices and can generate anything inside by various 
linear combinations. Note that there are huge patches of color space that aren’t 
inside the triangle; these are colors we can perceive that are simply not available for 
display on this type of device. By choosing other phosphors, you may define different 
triangles and try to include more of the space. However, phosphors are complex 
compounds that must meet many conflicting criteria, such as x,y chromaticity, purity, 
persistence, stability, toxicity, and cost [267]. 

Recall from Chapter 2 that a color may be objectively described in the CIE XYZ 
color space as a linear combination of the three x̄(λ), ȳ(λ), and z̄(λ) matching 
functions. In effect, this is a 3D linear space with a particular choice of axes. We 
may choose any three mutually orthogonal nonzero vectors to form an orthogonal 
set of axes (or basis) in this space. We may then specify a color with respect to these 
new vectors, and a linear transformation will take us from this set of coordinates to 
XYZ coordinates, or vice versa. 

The principal observation that supports the design of current CRTs is that most 
sets of red, green, and blue phosphors form a roughly orthogonal (or at least 
nondegenerate) basis in a linear color space. 

When a broadcaster creates a video signal, the color information is described 
as though all monitors used a particular set of standard phosphors. Thus, from 
the broadcaster’s point of view, the precise meaning of “red, green, and blue” is 
exactly the spectral response of the standard phosphors bearing those names. The 
chromaticities of these phosphors are given in Table 3.10 along with the white point 
for the CIE standard D6500 illuminant. 

If a particular monitor was not constructed out of the standard phosphors, then 
the displayed color will be different from what was intended. Whether this distortion 
is acceptable depends on the amount by which the particular phosphors used differ 
from the standards, and the desired accuracy of the match. Some receivers have 
internal circuitry to map the incoming signal (assumed to be with reference to the 
NTSC primaries) to the particular phosphors in that tube [181]. Most industrial 
RGB monitors do not include this circuitry. 

The main point here is that it is meaningless to speak of “RGB values” without 
explicit reference to which phosphors you are discussing. Nevertheless, you often 
hear computer graphics people speak of a color in terms of RGB without reference 
to a phosphor set. They are usually implicitly referring to the RGB signal intensities 
that they feed into some particular monitor to achieve the desired color. If they do 
not specify their phosphors, then they are really not telling you much; if you use 
those RGB values on another monitor, you will probably get something like what 
they had, but you may be rather far off; there is certainly no need for much care or 
precision in matching such a loosely specified color. 

The mechanism for converting from a standard color space to the particular RGB 
phosphors in some monitor is straightforward. The following discussion will show 
the procedure with respect to XYZ color space, since transformations to and from 
that space are well known for most color descriptions. 

Our goal is to find a matrix M, which will take a three-element vector representing 
an XYZ color and transform it to an equivalent RGB vector for some particular 
monitor. We will compute M by finding N (the transform from RGB to XYZ) and 
then inverting the matrix. 

The first step is to find the chromaticities of the phosphors and white point of the 
target monitor. We will call the white spot (w_x, w_y), and the red, green, and blue 
phosphors, respectively, (r_x, r_y), (g_x, g_y), and (b_x, b_y). The corresponding z value 
for each color may be found from z = 1 − x − y. 

From the phosphor triplets the matrix K in Equation 3.17 is built, and from the 
white-spot triplet the XYZ vector W is built; the latter is the color corresponding to 
the white point (X_n, Y_n, Z_n) scaled so that the luminance Y_n has the value 1.0. We 
will also have use for the RGB white point F = (1 1 1). 

    K = \begin{bmatrix} r_x & r_y & r_z \\ g_x & g_y & g_z \\ b_x & b_y & b_z \end{bmatrix}

    W = \begin{bmatrix} w_x / w_y & 1 & w_z / w_y \end{bmatrix}

    G = \begin{bmatrix} G_r & 0 & 0 \\ 0 & G_g & 0 \\ 0 & 0 & G_b \end{bmatrix}

Observe that N = GK (the RGB-to-XYZ matrix is given by the matrix of the 
phosphor XYZ chromaticities, differentially scaled so that the white-point luminance 
Y is set to 1.0). Since N relates the XYZ white W to the RGB white F, we can 
write W = FN. Substituting the previous expression for N into this equation, we 
find W = F(GK), so WK^{-1} = FG. Writing V = FG for the row vector of the 
diagonal entries of G, rewrite this as V = WK^{-1}. We now have V, and from that 
we can build G. With this, find 
N = GK, and from that result find M = N^{-1}. In summary, the steps are as follows: 

1 Build W and K from the monitor white spot and color phosphor chromaticities. 

2 Compute V = WK^{-1}. 

3 From the vector V, build the matrix G. 

4 Compute N = GK. 

5 Compute M = N^{-1}. 

Now, to convert any XYZ space color vector C_xyz into the appropriate RGB 
color C_rgb for this monitor, just post-multiply C_xyz by the XYZ-to-RGB matrix 
M: C_rgb = C_xyz M. Similarly, if you have designed a color C_rgb you like on 
your monitor and you want to know its XYZ coordinates C_xyz, use the inverse 
matrix: C_xyz = C_rgb M^{-1}. The matrix M for a monitor with standard NTSC 
phosphors and the white point given in Table 3.10 is 

    M = \begin{bmatrix} 1.967 & -0.955 & 0.064 \\ -0.548 & 1.938 & -0.130 \\ -0.297 & -0.027 & 0.982 \end{bmatrix}    (3.17)
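
The five steps above can be sketched in plain Python using the row-vector convention C_rgb = C_xyz M. The chromaticity values in the sketch are the commonly quoted NTSC primaries and D6500 white point, entered here as assumptions; substitute measured values for a particular monitor. The helper names (invert3, row_times_matrix, xyz_to_rgb_matrix) are hypothetical and introduced only for this example.

```python
def invert3(m):
    """Inverse of a 3x3 matrix (a list of rows), computed from the adjugate."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def row_times_matrix(v, m):
    """Row vector times matrix, matching the C_rgb = C_xyz M convention."""
    return [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]


def xyz_to_rgb_matrix(red, green, blue, white):
    """Steps 1 to 5: build W and K, then V, G, N = GK, and finally M = N^-1."""
    K = [[x, y, 1.0 - x - y] for (x, y) in (red, green, blue)]
    wx, wy = white
    W = [wx / wy, 1.0, (1.0 - wx - wy) / wy]      # white scaled so Y = 1
    V = row_times_matrix(W, invert3(K))           # V = W K^-1
    # G is diagonal with entries V, so GK scales row r of K by V[r].
    N = [[V[r] * K[r][c] for c in range(3)] for r in range(3)]
    return invert3(N)


# Assumed chromaticities (x, y): NTSC primaries and a D6500 white point.
NTSC_RED, NTSC_GREEN, NTSC_BLUE = (0.67, 0.33), (0.21, 0.71), (0.14, 0.08)
D6500_WHITE = (0.313, 0.329)

M = xyz_to_rgb_matrix(NTSC_RED, NTSC_GREEN, NTSC_BLUE, D6500_WHITE)
for row in M:
    print("  ".join(f"{value:7.3f}" for value in row))
```

A color is then converted with row_times_matrix(c_xyz, M), and the reverse conversion uses invert3(M).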


3.S.1 Converting XYZ to Sp+ctra 

One problem shared by all these systems is that the resulting color does not have
an intrinsic spectral representation. We will find it important in later sections to
describe colors as functions defined with spectral representations, providing an
amplitude at each wavelength. We saw earlier the equations to convert such a color,
$C(\lambda)$, into XYZ coordinates. But the color systems mentioned above provide only
the XYZ coordinates; from these three numbers we wish to build a corresponding
spectrum $C(\lambda)$.

There are several techniques that may be used, each with advantages and
disadvantages. These are discussed in Glassner [155]. If you know the phosphor curves for
your monitor, then you may simply represent any color by the appropriate weighted
sum of those three phosphor curves. Another reasonably useful method synthesizes
a color by weighting the first three Fourier basis functions.

Specifically, choose a flat spectrum $F_1(\lambda)$, a single cycle of a sine curve $F_2(\lambda)$,
and a single cycle of cosine $F_3(\lambda)$. Since these functions form a basis for all spectra
and the conversion from spectra to RGB is linear, we may match an RGB with the
transformed values of the spectra and use the same weights to create a new spectrum.
In this case, build the three functions each from 380 to 780 nm:


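$$
\begin{aligned}
F_1(\lambda) &= 1.0 \\
F_2(\lambda) &= \frac{1}{2}\left[ 1 + \sin\left( 2\pi\,\frac{\lambda - 380}{400} \right) \right] \\
F_3(\lambda) &= \frac{1}{2}\left[ 1 + \cos\left( 2\pi\,\frac{\lambda - 380}{400} \right) \right]
\end{aligned}
\tag{3.18}
$$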


We now wish to find the three weights $W = [w_1\ w_2\ w_3]$ with which to scale
these spectra to build the new spectra. To find these weights we will find the RGB
coefficients of each spectrum and store them in a matrix $D$. We build $D$ from the
XYZ components of the three spectra:


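$$
D = \begin{pmatrix}
\int_{\lambda} F_1(\lambda)\,\bar{x}(\lambda)\,d\lambda & \int_{\lambda} F_1(\lambda)\,\bar{y}(\lambda)\,d\lambda & \int_{\lambda} F_1(\lambda)\,\bar{z}(\lambda)\,d\lambda \\
\int_{\lambda} F_2(\lambda)\,\bar{x}(\lambda)\,d\lambda & \int_{\lambda} F_2(\lambda)\,\bar{y}(\lambda)\,d\lambda & \int_{\lambda} F_2(\lambda)\,\bar{z}(\lambda)\,d\lambda \\
\int_{\lambda} F_3(\lambda)\,\bar{x}(\lambda)\,d\lambda & \int_{\lambda} F_3(\lambda)\,\bar{y}(\lambda)\,d\lambda & \int_{\lambda} F_3(\lambda)\,\bar{z}(\lambda)\,d\lambda
\end{pmatrix}
\tag{3.19}
$$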

Each component of $D$ is the result of integration of one of the basis functions with
one of the CIE matching functions. For example, the upper-left entry is
$\int_{\lambda=380}^{780} F_1(\lambda)\,\bar{x}(\lambda)\,d\lambda$.
Since $D$ represents the XYZ coordinates of the spectra, the composite matrix $DM$
represents their RGB values, where $M$ is the XYZ-to-RGB matrix built in the
preceding section. Recall that $M$ is different for each monitor’s unique set of phosphors.
Some set of weights $W$ on these RGB values will match the RGB color $R = (r\ g\ b)$
we have designed:

$$
W(DM) = R \tag{3.20}
$$

We may now easily solve for $W$:

$$
W = R(DM)^{-1} \tag{3.21}
$$


The spectrum $C(\lambda)$ we desire is thus $C(\lambda) = w_1 F_1(\lambda) + w_2 F_2(\lambda) + w_3 F_3(\lambda)$. This
spectrum is smooth and continuous, but it may contain negative values.
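As a rough sketch of Equations 3.18 through 3.21, the following Python fragment builds $D$ by numerical integration, solves for the weights, and returns the spectrum as a function. Several pieces are assumptions made only so the example runs on its own: the Gaussian lobes in `MATCHING` are crude stand-ins for the tabulated CIE matching functions (substitute the real tables in practice), the integrals are simple 5-nm Riemann sums, and all function names are hypothetical.

```python
import math

STEP = 5.0
WAVELENGTHS = [380.0 + STEP * k for k in range(81)]  # 380..780 nm inclusive

# XYZ-to-RGB matrix of Equation 3.17 (row-vector convention: rgb = xyz M).
M = [[1.967, -0.955, 0.064],
     [-0.548, 1.938, -0.130],
     [-0.297, -0.027, 0.982]]


def phase(lam):
    return 2.0 * math.pi * (lam - 380.0) / 400.0


# The three basis spectra of Equation 3.18.
BASIS = [lambda lam: 1.0,
         lambda lam: 0.5 * (1.0 + math.sin(phase(lam))),
         lambda lam: 0.5 * (1.0 + math.cos(phase(lam)))]


def lobe(lam, center, width):
    return math.exp(-0.5 * ((lam - center) / width) ** 2)


# ASSUMPTION: rough Gaussian stand-ins for x-bar, y-bar, z-bar, NOT the CIE data.
MATCHING = [lambda lam: 1.06 * lobe(lam, 600.0, 38.0) + 0.36 * lobe(lam, 442.0, 16.0),
            lambda lam: lobe(lam, 556.0, 46.0),
            lambda lam: 1.78 * lobe(lam, 449.0, 22.0)]


def spectrum_to_xyz(spectrum):
    """Integrate a spectrum against each matching function (Riemann sum)."""
    return [sum(spectrum(lam) * m(lam) for lam in WAVELENGTHS) * STEP
            for m in MATCHING]


def vec_mat(v, m):
    return [sum(v[k] * m[k][j] for k in range(3)) for j in range(3)]


def inverse3(m):
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    return [[(e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det],
            [(f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det],
            [(d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det]]


def rgb_to_spectrum(rgb):
    """Return (weights, C) where C(lam) = w1 F1 + w2 F2 + w3 F3."""
    D = [spectrum_to_xyz(F) for F in BASIS]       # Equation 3.19
    DM = [vec_mat(row, M) for row in D]           # RGB values of the basis
    weights = vec_mat(list(rgb), inverse3(DM))    # Equation 3.21
    return weights, (lambda lam: sum(w * F(lam) for w, F in zip(weights, BASIS)))
```

Because every step is linear, converting the returned spectrum back with `vec_mat(spectrum_to_xyz(C), M)` should reproduce the input RGB up to rounding, which makes a useful self-check whatever matching-function data is plugged in.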

A drawback of this process is that the curves have little relation to the monitor.
The connection may be made stronger by using real spectra as the matching functions 
if they are available, either from the monitor’s phosphors for output, or the scanner’s 
response if we are digitizing a photograph. Although they are more closely tied to 
the color being matched, these spectra are often far from smooth. 

Color calculations with intuitive meaning are best performed in L*u*v* or L*a*b*. 
Referring back to our earlier example of linear interpolation, equal parametric steps 
in this space will result in perceptually equal steps of color. Other color calculations, 
such as finding the center of gravity of a collection of colors, filtering a set of colors, or 
even simply finding the color halfway between two extremes, are all best performed 
in one of these perceptually uniform spaces if the results need to be perceptually 
consistent and there is sufficient processing power available. 


3.6 Gamut Mapping 

Consider again Figure 1.42; it shows the triangle of colors representable on a CRT
for a given set of three phosphors. The range of displayable colors for any particular
device (e.g., monitor, printer, film) is called the device gamut.

Unfortunately, not all displays share the same gamut. When some of the colors 
in an image lie outside the colors available on a particular device, we must somehow 
get the colors in the image gamut to all lie within the device gamut. This process is 
called gamut mapping. 

Gamut mapping is difficult since it involves somehow distorting the original 
picture in order to make it displayable. 

A chromaticity diagram for a typical monitor and printer is shown in Figure 3.32 
(color plate). The monitor gamut is marked with the triangle, and the printer is the 
colored region in the center. Notice that the white points of the two devices do not 
line up. Also notice that there are colors available to the printer that the monitor 
cannot represent; an image designed on a monitor is unlikely to take advantage of 
these colors. Far worse from today’s computer graphics standpoint is that there are 
many colors available on a CRT that are simply unprintable. There are missing 
regions in the greens and reds, and a great deal of unavailable color space in the 
blues. 

Proper gamut mapping for a given image is still an art [181,420]. Hall [181] 
distinguishes two types of out-of-gamut colors: those that have a chromaticity that 
cannot be matched by the device, and those that can be matched in chromaticity 
but not intensity. The first set of colors, when mapped to a CRT device, gives RGB 
values less than 0; the second set gives RGB values greater than 1. Most gamut-mapping
methods assume that the input is an image that has already been converted
to RGB for a particular monitor. 

Most gamut-mapping methods seem to fall into one of two general categories: 
global and local approaches. A local approach examines each pixel individually
and adjusts only those pixels that are out of gamut. A global approach applies 
information gathered from the entire picture when considering what to do with each 
pixel, including those within gamut. 

Local methods operate on each pixel independently, typically only processing 
those that are out of gamut. Some local methods involve projecting the RGB values 
into another color system, operating on the pixel there, and then returning to RGB. 
Some of these methods include the following: 

■ Scaling the pixel RGB components uniformly until the pixel is within the device
gamut.

■ Scaling the intensity of the pixel but leaving its chromaticity unchanged.

■ Desaturating the pixel, leaving the hue and intensity unchanged.

■ Clipping the pixel to the range [0,1]. 

■ Scaling the pixel nonuniformly even if it is within gamut. 
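Three of the local methods listed above can be sketched as per-pixel Python functions. The function names are hypothetical, and the desaturation step makes one simplifying assumption that is not from the text: "intensity" is taken to be the plain mean of the three components rather than a weighted luminance.

```python
def clip(rgb):
    """Clamp each component independently to the range [0, 1]."""
    return tuple(min(1.0, max(0.0, c)) for c in rgb)


def scale_uniform(rgb):
    """Divide all components by the largest one if it exceeds 1.

    This only repairs components above 1; it assumes none are negative.
    """
    peak = max(rgb)
    if peak <= 1.0:
        return tuple(rgb)
    return tuple(c / peak for c in rgb)


def desaturate(rgb):
    """Blend toward the gray of equal mean until every component fits."""
    gray = sum(rgb) / 3.0  # ASSUMPTION: mean of R, G, B stands in for intensity
    if not 0.0 <= gray <= 1.0:
        return clip(rgb)   # no displayable color has this intensity; fall back
    t = 1.0                # fraction of the original saturation we can keep
    for c in rgb:
        if c > 1.0:
            t = min(t, (1.0 - gray) / (c - gray))
        elif c < 0.0:
            t = min(t, gray / (gray - c))
    # The final clip only removes rounding error of a few ulps.
    return clip(tuple(gray + t * (c - gray) for c in rgb))
```

For instance, `clip((1.5, -0.2, 0.5))` gives `(1.0, 0.0, 0.5)` and `scale_uniform((2.0, 1.0, 0.5))` gives `(1.0, 0.5, 0.25)`. Each function looks at a single pixel only, which is exactly what makes these methods local.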

The problem with all local approaches is that they can introduce a type of error we 
call limit errors. Limit errors appear when an object suddenly changes in appearance 
from one pixel to the next due to an abrupt decision to apply a local transformation. 
For example, consider a simple clipping operation that takes any color component 
beyond 1 and sets it to 1. Figure 3.33(a) shows the color profile of the desired green 
component of a sphere across a scan line, and (b) shows the result of clipping. In 
effect, the bright part of the sphere becomes a flat sea of saturated green, and the 
object will no longer look spherical at all. Even a smooth local operation, such as 
the one shown in (c), changes the shading so that the object is no longer shaded like 
a sphere. All local approaches share these sorts of problems. 

Global approaches look over the entire picture before doing any processing. Some 
global methods include these: 

■ Finding the smallest color component in the picture (that is, the smallest value
of R, G, or B) and calling it a. Similarly, find the largest color component
and call it b. Display each pixel i as

$$
\left( \frac{R_i - a}{b - a},\ \frac{G_i - a}{b - a},\ \frac{B_i - a}{b - a} \right)
\tag{3.22}
$$


■ Similar to the above, but compressing only that part of the input range for
each color component that is out of range, as in Figure 3.34. Using s and d
from Table 3.11, display each pixel as

$$
\left( d + s(R_i - a),\ d + s(G_i - a),\ d + s(B_i - a) \right)
\tag{3.23}
$$
FIGURE 3.33

Local gamut mapping. (a) A desired profile for a sphere. (b) The clipped profile. (c) A less drastic
transformation.


■ Scanning the image and gathering statistics on all out-of-gamut pixels. Select 
a local or global technique based on this information. 
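The first two global methods can be sketched as follows. This is an illustrative reading rather than a definitive implementation: the function names are hypothetical, a picture is assumed to be a list of `(R, G, B)` tuples, and the four choices of d and s are written out as one interpretation of Table 3.11 (each pair is picked so that an end of the range that is already inside [0, 1] stays fixed while an end outside is pulled to 0 or 1).

```python
def picture_range(pixels):
    """Return (a, b): the smallest and largest component in the whole picture."""
    values = [c for p in pixels for c in p]
    return min(values), max(values)


def global_scale(pixels):
    """Equation 3.22: map the picture-wide range [a, b] onto [0, 1]."""
    a, b = picture_range(pixels)
    if a == b:                      # constant picture: nothing to stretch
        return [tuple(min(1.0, max(0.0, c)) for c in p) for p in pixels]
    return [tuple((c - a) / (b - a) for c in p) for p in pixels]


def partial_compress(pixels):
    """Equation 3.23, with d and s selected from the positions of a and b."""
    a, b = picture_range(pixels)
    a_in = 0.0 <= a <= 1.0
    b_in = 0.0 <= b <= 1.0
    if a_in and b_in:
        d, s = a, 1.0                       # already displayable: identity
    elif a == b:
        return global_scale(pixels)         # degenerate constant picture
    elif b_in:
        d, s = 0.0, b / (b - a)             # pull a up to 0, keep b fixed
    elif a_in:
        d, s = a, (1.0 - a) / (b - a)       # keep a fixed, pull b down to 1
    else:
        d, s = 0.0, 1.0 / (b - a)           # both ends move: full rescale
    return [tuple(d + s * (c - a) for c in p) for p in pixels]
```

Both functions must scan the entire picture before touching any pixel, and both then apply the same transformation to every pixel, including those that were already in gamut.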

Global approaches don’t introduce the sort of limit errors created by local
methods, but they have their own drawback: they introduce semantic inconsistency errors.
Consider Figure 3.35, which shows a red ball reflected in two different mirrors. For 
convenience, we will assume that the ball is a uniform bright red color, so from any 
angle it appears as a flat disk. The left mirror is 25% and the right mirror 50% 
reflective. Suppose that the brightest parts of the ball map to a red component of 
3 units (the monitor can only display values 0 to 1). So the pixels representing the 
visible ball have red value 3, pixels displaying the ball in the left mirror have a red 
value of 3/4, and pixels showing the image of the ball in the right mirror have a red 
value of 1.5. 

If we only adjust the pixels that are out of gamut, then we will affect the pixels
showing the ball and the pixels in the right-hand mirror but not those in the left-hand
mirror. This is an example of a violation of semantic consistency: the picture no
longer represents what it originally was meant to represent. The red ball is darker,
the reflection in the right mirror is darker, and the reflection in the left mirror is
unchanged. If no other objects in the scene are out of gamut, and are reflected off
of these mirrors, then we have a strange situation where almost everything in the
scene is telling us about the reflectivity of these mirrors, but the images of the ball
act differently.

FIGURE 3.34

The six possibilities for range compression.

              a ∈ [0,1]                      a ∉ [0,1]
b ∈ [0,1]     d = a, s = 1                   d = 0, s = b/(b − a)
b ∉ [0,1]     d = a, s = (1 − a)/(b − a)     d = 0, s = 1/(b − a)

TABLE 3.11

Selecting d and s for partial range compression.

FIGURE 3.35

A ball reflected in two different mirrors with different colored reflectivities.

A local approach to mapping this picture would also fail. Suppose we simply clip;
then the red ball and its right reflection are both at red value 1, and the left reflection 
is at 3/4. Suddenly the right mirror seems to be reflecting all of its light for the red 
ball, while reflecting only half of the light for everything else in the scene. 
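The arithmetic of the mirrored-ball example is easy to check. The short sketch below uses the red values from the text (3 for the ball, with 25% and 50% mirrors) and reports the reflectivity a viewer would infer from the displayed picture; the dictionary layout and the function name are illustrative only.

```python
ball = 3.0
rendered = {"ball": ball, "left": 0.25 * ball, "right": 0.50 * ball}


def implied_reflectivity(image):
    """Reflectivity a viewer would infer: mirror image divided by the ball."""
    return {side: image[side] / image["ball"] for side in ("left", "right")}


# Local clipping: every value above 1 is set to 1.
clipped = {name: min(1.0, value) for name, value in rendered.items()}

# Adjusting only the out-of-gamut pixels (here by dividing them by 3).
only_out_of_gamut = {name: (value / 3.0 if value > 1.0 else value)
                     for name, value in rendered.items()}

print(implied_reflectivity(rendered))           # {'left': 0.25, 'right': 0.5}
print(implied_reflectivity(clipped))            # {'left': 0.75, 'right': 1.0}
print(implied_reflectivity(only_out_of_gamut))  # {'left': 0.75, 'right': 0.5}
```

In both adjusted pictures the ball's reflections imply mirror reflectivities that disagree with what the rest of the scene shows, which is the inconsistency described above.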

When all we have to work with is a grid of RGB values, there may be no best 
solution. Stone et al. [420] suggest that a gamut-mapping technique should satisfy 
five criteria, whose importance is based on the image and the destination gamut into 
which we are mapping: 

1 The gray axis of the image should be preserved. 

2 Maximum luminance contrast is desirable. 
3 Few colors should lie outside the destination gamut.

4 Hue and saturation shifts should be minimized. 

5 It is better to increase than to decrease the color saturation. 

If the goal is to display synthetic images, then we can use information gathered 
during the rendering step to help in the gamut mapping, or even guide the rendering 
process so that explicit gamut mapping can be avoided altogether, preventing any 
limit or semantic inconsistency errors [160]. 

One approach is to change the renderer so that it creates not just RGB values at
each pixel, but also a symbolic expression that completely specifies which
colors were used to create that pixel and how they were combined [160]. Returning
to our mirrored ball example, pixels in the right-hand mirror reflecting the ball 
would contain an expression that combines the emission of the light source and the 
reflectivities of the ball and mirror. If we have this information for every pixel, we 
can go back to the scene and adjust the scene colors until the resulting image is 
completely in gamut, effectively rerendering the image with new colors for one or 
more lights or objects. 
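To make the idea concrete, here is a deliberately tiny sketch in which each pixel's "symbolic expression" is just a list of scene quantities to be multiplied together. It is not the method of [160]; it assumes a single color channel, expressions that are pure products, and one adjustable quantity that appears exactly once in every pixel. All names are hypothetical.

```python
from math import prod

# Scene quantities for the red channel of the mirrored-ball example.
scene = {"light": 3.0, "ball": 1.0, "left_mirror": 0.25, "right_mirror": 0.50}

# Each pixel stores which scene quantities were multiplied to produce it.
pixels = {
    "ball": ["light", "ball"],
    "ball_in_left_mirror": ["light", "ball", "left_mirror"],
    "ball_in_right_mirror": ["light", "ball", "right_mirror"],
}


def render(scene, pixels):
    """Evaluate every pixel expression with the current scene values."""
    return {name: prod(scene[f] for f in factors)
            for name, factors in pixels.items()}


def bring_into_gamut(scene, pixels, adjustable="light"):
    """Dim one scene quantity until the brightest pixel is 1, then re-evaluate.

    ASSUMPTION: `adjustable` occurs exactly once in every pixel expression,
    so dividing it by the peak value scales the whole image linearly.
    """
    peak = max(render(scene, pixels).values())
    adjusted = dict(scene)
    if peak > 1.0:
        adjusted[adjustable] = scene[adjustable] / peak
    return adjusted, render(adjusted, pixels)


new_scene, image = bring_into_gamut(scene, pixels)
print(new_scene["light"])  # 1.0
print(image)  # {'ball': 1.0, 'ball_in_left_mirror': 0.25, 'ball_in_right_mirror': 0.5}
```

Because the image is re-evaluated from adjusted scene values rather than patched pixel by pixel, the two reflections still stand in the ratios 25% and 50% to the ball.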

The resulting image is not identical to the original picture, but any gamut-mapping 
algorithm must change the picture since the original cannot be displayed. The 
advantage of this approach is that the resulting picture is entirely self-consistent: the 
image is rendered using adjusted colors, so for that set of object and light spectra, 
the image displays the rendered results, not just a displayable distortion. Another 
advantage is that the same scene may be processed several times for different gamuts, 
so a scene designed on a monitor could have its colors adjusted so that the resulting 
image includes those colors unavailable on the monitor but within the printer gamut. 
For example, a dark blue wall may become desaturated because the printer can’t 
handle that blue (thereby changing the effect of that wall on the rest of the scene), 
but a green carpet might become somewhat brighter because the printer has that 
color available. Note that the colors of some objects and lights in the scene may 
not be directly visible in the final picture, but they also may be adjusted to cause 
the entire image to come into gamut. This process may be run automatically, or a 
designer may exert manual control over which objects may change color and which 
may not. We will return to this idea in Chapter 20. 


3.7 Further Reading 

A good discussion of color and many of its applications to computer display may be 
found in Durrett’s book [133]. In particular, Merrifield’s chapter [298] contains a 
lot of information about CRTs, and Silverstein’s chapter [411] discusses many issues 
related to the human visual system. The chapter by Andreottola [10] surveys color
hard-copy devices, which have very different characteristics than CRTs. 

An extensive discussion of almost every aspect of phosphorescence and phosphors 
is offered by Leverenz [267], who includes chemical information, glow curves, and 
even some manufacturing suggestions. 

Colormap correction for gamma compensation was first presented to the graphics 
community by Catmull [78]. The presentation there is carried on in Hall’s book 
[181] and Blinn’s 1989 column [50]. The problem of gamut mapping is discussed 
by Hall [181], who offers some suggestions; more recent work is discussed by Stone 
et al. [420]. 

One of the first papers to consider the display as an integral part of the computer 
graphics process was presented by Kajiya and Ullner in 1981 [238]. 


3.8 Exercises

Exercise 3.1

Perform the Gaussian spot analysis for a rectangular grid. Use the center of a cell, 
the center of an edge, and the corner of a cell as the three points of analysis. 

(a) Identify the spot centers that contribute significantly to each type of point, 
using the definition in the text. 

(b) Compute the values at these three points for an all-white signal. 

(c) Compute the values at these three points for an all-black signal. 

(d) Compute the values at these three points for a perfect checkerboard signal. 

Exercise 3.2

Write a program to compute spectra for a given RGB using the monitor matrix 
in Equation 3.17 and the first three Fourier basis functions. Transform the colors 
(.2, .4, .13) and (.8, .55, .45) into spectra from 400 to 700 nm in 5-nm increments. 

Exercise 3.3

(a) Implement gamut mapping using clipping. 

(b) Implement gamut mapping using global scaling. 

(c) Implement gamut mapping using partial range compression of Equation 3.23. 

(d) Try out all three methods on a few pictures that are out of gamut; how do the 
results compare? Is any method fully acceptable? 

Exercise 3.4

Analyze the contrast and average properties for the four fundamental cells shown in
Figure 3.36.

(a) is a fundamental cell of three pixels, with one white pixel. It has a density of
$\tau = 1/3 \approx 0.333$.

(b) is the same fundamental cell we analyzed earlier, but with a different tiling
pattern. It has a density of $\tau = 1/4 = 0.25$.

(c) is a fundamental cell of seven pixels, with three white pixels. It has a density
of $\tau = 3/7 \approx 0.429$.

(d) is a cell of eight pixels, with four white pixels. It has a density of $\tau = 4/8 = 0.5$.

FIGURE 3.36

Figure for tiling exercise. (a) A fundamental cell of three spots. (b) A different tiling of a four-spot
cell. (c) A fundamental cell of seven spots. (d) A fundamental cell of eight spots.








Exercise 3.5

(a) Look very closely at the front of a color monitor and draw the pattern of 
phosphors you see there. Suggest why they are arranged this way. 

(b) Render on that monitor an image that is all black except for a single one-pixel-wide
vertical white line, and view it from at least arm’s length on that monitor.
What does the image look like? Look close up at the phosphor pattern and 
describe how the phosphors are being illuminated. If the phosphors are not 
lined up vertically in perfect columns, do you see a jagged line when at arm’s 
length? If not, why not? 

(c) Repeat step (b) for a horizontal line. 

Exercise 3.6

Suggest a function for a local gamma-correction operation that is continuous in the 
first and second derivatives. 

Exercise 3.7

Kajiya and Ullner [238] model a Gaussian bump as

$$
g(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[ -x^2/\sigma^2 \right] \tag{3.24}
$$

and suggest that the best value of $\sigma$ for an interdot spacing of 1 unit is $\sigma = 0.66$.

(a) Plot the function $f(x) = g(x) + g(1 - x)$ between two dots in the domain
$x \in [0, 1]$ for $\sigma = 0.66$ and $\sigma = 0.55$.

(b) Numerically integrate to find the RMS flat-field error

$$
E = \int_0^1 [1 - f(x)]^2 \, dx \tag{3.25}
$$

between the two dots for $\sigma = 0.66$ and $\sigma = 0.55$.

(c) Use your results from (b) to find the minimum for $E$ as a function of $\sigma$, and
plot the resulting field as in (a) using the same scale. Compare the shape of
this curve and its values to your results from (a).

(d) In (b) we computed $E$ using a nominal flat-field value of 1. Was this a
good idea? Would it be better or worse to use a nominal flat-field value of
$g(0) + g(1)$?

(e) Test your answer to (d) by finding the RMS flat-field error

$$
E_d = \int_0^1 [d - f(x)]^2 \, dx \tag{3.26}
$$

where $d = g(0) + g(1)$ for $\sigma = 0.66$ and $\sigma = 0.55$. How do the results compare?

(f) Use your results from (b) and (e) to find the minimum for $E_d$ as a function of
$\sigma$, and plot the resulting field $f$ as in (a) using the same scale. Compare the
shape of this curve and its values to your results from (a).




SIGNAL PROCESSING 


I came to the mouth of a huge cave before which I
stopped for a moment, stupefied by such an
unknown thing. I arched my back, rested my left
hand on my knee, and with my right shaded my
lowered eyes; several times I leaned to one side,
then the other, to see if I could distinguish 
anything, but the great darkness within made this 
impossible. After a time there arose in me both 
fear and desire—fear of the dark and menacing 
cave; desire to see whether it contained some 
marvelous thing. 


Leonardo da Vinci 




INTRODUCTION TO UNIT II 


What is an image? If we’re scanning the world with our eyes, then the visible
world is projected onto our retinas, where photoreceptors convert the
light information into electrical and chemical information. The photoreceptors are 
densely packed on the retina, but there are only a finite number of them. So the 
information leaving the retina is really only a description of a bunch of individual 
colored dots. 

This hardly seems possible; the image of the world in our mind’s eye appears to be 
a smooth and continuous picture, and hardly a collection of colored dots. In fact our 
visual system is doing a lot of processing both before and after the transformation 
of the light into dots, not the least of which is to fuse our individual visual images 
into a coherent whole. This remarkable process creates a mental image that for the 
most part is free of the dots imposed by the photoreceptor pattern. 

Image synthesis also produces a set of colored dots: these are the color values 
on a frame buffer or other display. Because the same static image is visible for a 
prolonged period of time (as opposed to the fleeting images on our retina), side 
effects caused by this discrete representation become much more visible and thus 
more important. Many of these side effects are known collectively as aliasing. 
Even when the individual images are fine, we can experience aliasing in time for an 
animated sequence, since each frame of the sequence represents its own (discrete) 
slice of time. 

A synthetic image computed on a digital computer is a digital signal. We usually 
imagine it to be a discrete approximation of some smooth function that provides a 
color at every point on the image. The image inside a computer is necessarily digital 
by nature; the computer can only store numbers and (usually) can only compute 
with finite precision. 

To understand the nature of digital signals, we need to discuss the field of digital
signal processing, which involves the creation of digital signals from smooth ones,
the transformation of those digital signals, and the process of eventually smoothing
them out again.

The worst problem that arises when we convert a smooth signal into a digital one 
is aliasing. The mere word can conjure up visions of jaggies, motion strobing, moire 
patterns, popping, and many other objectionable artifacts in images and animations. 
It is important that any rendering system suppress aliasing effects as much as possible. 

The first step in controlling aliasing effects is to understand the problem. Aliasing 
is a direct result of the fact that in computer graphics we work on a digital computer, 
which stores continuous signals as a collection of samples. Intuitively, it seems 
reasonable to believe that if you have enough samples of a signal (such as the 
variation of color across a scan line), you should have a pretty good description 
of the thing that was sampled. Aliasing comes about when we don’t have enough 
samples for the object under consideration. 

Aliasing effects are prevalent in computer graphics. Like crabgrass in the manicured
lawn of a rendering system, aliasing shows up everywhere you don’t explicitly
address it. To suppress aliasing requires a good understanding of its sources. The 
best way to understand aliasing is to understand what happens to a signal when it 
is sampled; the Fourier transform is a mathematical tool that allows us to see this 
effect most clearly. It also gives us the vocabulary and related tools to discuss aliasing 
problems. 

We will see that while in general it is impossible in practice to remove aliasing 
effects from our images, we usually can contain them or change their character so 
they are less annoying. 

This unit of the book presents material from the field of signal processing that 
is relevant to computer graphics. Our goal will be to develop an understanding of 
aliasing, both in the time and frequency domains. Our principal tool in this analysis 
will be the Fourier transform. 

In this part, I include most of the steps in various derivations and transformations. 
If you get lost between some pair of equations, a good technique is to expand 
everything out in both equations, and look for the simplifications that turn one 
into the other. When a transformation goes beyond basic manipulation by using an 
identity or other powerful tool, I will always mention it. I have also included several 
tables of useful identities and properties to which we will refer throughout the book. 

The traditional signal processing we cover in this chapter is a well-understood 
body of knowledge. It is quite clean and elegant from a mathematical point of 
view, once you get past the sometimes daunting notation. I have tried to make 
the notation as straightforward as possible in this chapter, but much of the heart
of signal processing comes about from transformations on equations that seem to 
inevitably contain a lot of subscripts, superscripts, limits, and other necessary clutter. 
The equations are typographically complex, but most are simple in concept. 

There are many books on digital signal processing listed at the end of these 
chapters. What I have tried to do here is to select and present just the parts that are 
most important to rendering. I have tried to be as clear as possible in the discussion
and the derivations, but I have not always been rigorous. For example, continuity, 
integrability, and other necessary conditions are often assumed, but not proven. The 
goal in these chapters is to present the general ideas behind digital signal processing 
in a way that makes them useful for computer graphics. 

Before we begin, it may be helpful to take a bird’s-eye view of the entire process. 
This is presented in Figure A for a 1D signal, and in Figure C for a 2D signal (or
image). The general idea is that we start with some 1D function (say a(t)) that is
defined for all values of its argument t, and eventually end up with something that
we present to an observer (using a CRT or other output device), labeled k in the 
figure. What happens in between is the subject matter of this unit. 

Let’s look at the general flow of information; all of these steps will be examined in 
much more detail in the chapters in this unit. The general goal is that we want signal 
k to look as much like signal a as we can. But we’re frustrated in that desire because 
we assume that we are only allowed to gather point samples of a, represented in c. 
The goal then is to somehow get something like a out of c. Anticipating the basic 
ideas of this unit, we will use the words “signal” and “function” interchangeably. We 
will also be a bit informal with the rest of our terms; we’ll sharpen them considerably 
as we move through the book. We will sometimes think of the function as a picture 
we want to show; think of it as a grey-level image so we don’t get bogged down with
questions about color right now. 

We begin with a, which is defined at all points t; this might be any kind of function
or procedure. We begin by gathering samples of a at particular points that are given
by b, which is a row of pulses that are so narrow they select just a single value from
a. If we multiply together these two functions we get c, which is 0 everywhere except
where there’s a pulse in b; at those places, c has the value of a. This is the process
of sampling, and it’s typically implemented in a rendering system by a ray tracer,
z-buffer scan converter, or similar visibility algorithm.
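In code, the product c is usually stored only where it is nonzero, as a list of positions and values. The sketch below assumes equally spaced pulses starting at t = 0; the signal `a` is an arbitrary stand-in and the function names are hypothetical.

```python
import math


def a(t):
    """A stand-in continuous signal; any function defined for all t would do."""
    return 0.5 + 0.5 * math.sin(2.0 * math.pi * t)


def sample(signal, spacing, count):
    """Return the nonzero part of c = a * b as (position, value) pairs.

    The pulses of b sit at t = 0, spacing, 2 * spacing, ... (an assumption).
    """
    return [(k * spacing, signal(k * spacing)) for k in range(count)]


c = sample(a, 0.25, 5)  # five point samples covering one period of a
```

Everything a did between the pulses is simply absent from `c`, which is why the later reconstruction step has to make an informed guess.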

Now we only constructed c in this way because we assumed that we were forced 
to (that is, a was so complicated the only information we could get out of it was a 
set of point samples). The signal c doesn’t look much like a at all; it’s basically a flat 
function with a few spikes. It may seem reasonable to simply connect the spikes to 
approximate a, and in fact that’s one way to go. But we will see that theoretically 
the best way to look at this is to convolve c with a filter given by d. In this case, that 
means we place one (reversed) copy of d on top of each spike in c, scaling that copy 
of d so it has the same height as c at the spike. This act of convolution (symbolized 
by a star (*)) results in e. In this case, d serves to simply connect the spikes, but a 
different choice of d would combine the spikes in a different way. 
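The "one scaled copy of d on each spike" description translates almost directly into code. This sketch assumes a tent (triangle) filter whose half-width equals the sample spacing, which is the choice that connects the spikes with straight lines; the names `tent` and `reconstruct` are illustrative. Because the tent is symmetric, the reversal of d has no visible effect here.

```python
def tent(x, width):
    """Triangle filter: 1 at x = 0, falling linearly to 0 at |x| = width."""
    return max(0.0, 1.0 - abs(x) / width)


def reconstruct(samples, spacing, t, kernel=tent):
    """Evaluate e(t) = (c * d)(t) for spikes given as (position, value) pairs.

    Each spike contributes one copy of the kernel, centered on the spike and
    scaled to the spike's height; the copies are summed at the point t.
    """
    return sum(value * kernel(t - position, spacing)
               for position, value in samples)


spikes = [(0.0, 1.0), (1.0, 3.0), (2.0, 2.0)]
print(reconstruct(spikes, 1.0, 0.5))   # 2.0, halfway between 1 and 3
print(reconstruct(spikes, 1.0, 1.25))  # 2.75, a quarter of the way from 3 to 2
```

Passing a different `kernel` changes how neighboring spikes are blended without touching the rest of the code.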

Now we will suppose that we want to show the signal on a display device that can
only switch between color values at a finite number of places. For example, on color
hardcopy devices this is the smallest blob of ink the printer can make. Typically the
printer lays down a blob of ink of a certain color, and then moves the print head just
a bit in order to lay down the next blob. We have no control over what happens
between the blobs; the inks combine or not as they may. A CRT is similar; we can
choose what color we want to display at a finite number of locations on the screen,
but what happens between them depends on the physical device. For a given density
of display samples, there will be some signals we can’t show. For example, if we can
display a maximum of 100 dots per inch, then we just won’t be able to show a sine
wave with a frequency of 1,000 dots per inch. So we filter the signal before sampling
it, this time trying to get rid of any wiggles in the signal that might be happening
between our samples.

FIGURE A

The general flow of a 1D signal from definition to display. (a) The input signal a. (b) A set of
equally spaced sampling pulses b. (c) The product signal ab, formed by multiplying the two signals
(or sampling signal a). (d) A reconstruction filter. (e) The signal c * d, formed by convolving signal
c with the filter d. (f) A low-pass filter. (g) The result of convolving e with f. (h) A new set of
sampling pulses. (i) The product gh. (j) The display reconstruction filter. (k) The displayed signal
i * j.

FIGURE B

The Fourier transform of the general flow of the 1D signal in Figure A from definition to display.

We smooth out the signal in e by convolving it with another filter, shown in f.
The convolution is a little trickier to see in this case since we place a (reversed) copy
of f over every point in e. We actually did this with d before, but because most of c
was zero, we never saw those copies. The result is the signal g.

Now we’re ready to figure out what to display. We make up another set of samples 
represented by h to stand for the frame buffer memory locations and multiply that
with g; that gives us i. Each spike of i corresponds to a color we will show. If the
pulses in h correspond to pixel centers, then each pulse in i gives the intensity at that 
pixel. 
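A discrete stand-in for these two steps is sketched below. It assumes the reconstructed signal e is available as a dense list of values (many per pixel), uses a simple box as the low-pass filter f, and renormalizes the box at the ends of the list; none of these choices comes from the text, and the function names are hypothetical.

```python
def box_filter(values, radius):
    """g = e * f with f a box of 2 * radius + 1 taps (renormalized at the ends)."""
    n = len(values)
    out = []
    for k in range(n):
        lo, hi = max(0, k - radius), min(n, k + radius + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def resample(values, stride):
    """i = g * h: keep one value per pixel, taken near each pixel's center."""
    return values[stride // 2::stride]


# A signal that wiggles far faster than a 4-values-per-pixel display can show.
e = [0.0, 1.0] * 8

naive = resample(e, 4)                   # sampling without filtering first
filtered = resample(box_filter(e, 2), 4)  # low-pass filter, then sample
```

Here `naive` comes out as all zeros, a flat black result that misrepresents a signal whose average is 0.5, while every entry of `filtered` lies close to 0.5. That difference is the reason for filtering before the second round of sampling.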

But we’re not done quite yet. Recall that once we’ve displayed a color in a frame
buffer, or printed it on a page, the color bleeds or the ink smears, depending on
the output device. Mathematically, what’s happening is that our signal i is getting
convolved with a function like j that tells how the device spreads around each of
our samples when it’s displayed. The resulting convolved signal is shown in k. This
is the signal that actually gets displayed.

The exact same process is shown for images in 2D. Here the original signal a is 
the ideal function that represents the world that we want to render. Signal c is the 
result of ray-tracing or scan-converting the scene, and i is what we store in the frame 
buffer or output file. Signal j is the characteristic response of the monitor or printer, 
so what we end up seeing is signal k. 

The whole trick of signal processing is to get each of these steps just right: we 
want to pick the right places to sample (that is, place the pulses in b), and we want
to use just the right filters in d and f to process the signal. We want to do this all
efficiently and accurately. 

In this unit we will spend a lot of time designing the sampling signal shown in b.
We will be guided by what we call The Sampler’s Credo: every sample is precious.
This is motivated by the fact that rendering is often very expensive; in scenes with 
millions of objects, every sample can take a long time to compute and involve many 
visibility and shading calculations. We don’t want to waste even one, and we don’t 
want to compute even one that we don’t really need. To make sure we get enough 
samples, we have to use every bit of knowledge we can about a, even to the point of 
building up that knowledge as we sample. 

We need to choose our reconstruction filter in d so that we recover a good 
approximation to a in signal e (joining the dots together with straight lines is easy, 
but not very smooth; we can do better). Then we need to choose a good low-pass 
filter in f so that we get rid of the quickly varying parts of the signal that we 
can’t display. Typically, we’re given the sampling density in h as a characteristic of 
the hardware and we can’t do anything about it. So we have to smooth out the 
inappropriate quickly changing parts of e prior to its resampling by h with as little 
damage as possible to the parts that we can represent. 

The multiplication steps in this diagram are straightforward, but the convolution 
steps may seem pretty weird. It turns out that they’re easy to understand if we take 
the Fourier transform of the various signals. This gives us a similar but different 
version of each function. The steps where we convolve functions then become 
multiplications of the Fourier representations, which have a very intuitive interpretation. 
For reference, the Fourier transform for Figure A is shown in Figure B, and the 
Fourier transform for Figure C is shown in Figure D. Notice that the multiplication 
operators have been replaced by convolution, and vice versa. This duality of 
multiplication and convolution is an important part of Fourier analysis. 
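
To make this duality concrete, here is a minimal numerical sketch in Python. It is not part of the original discussion: it assumes a discrete, periodic setting and uses a naive discrete Fourier transform, which has not been introduced yet, and the names dft, idft, and circular_convolve are purely illustrative. It checks that convolving two short signals directly gives the same result as multiplying their transforms and transforming back.

```python
import cmath

def dft(x):
    """Naive O(N^2) discrete Fourier transform of a finite sequence."""
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
            for k in range(n)]

def idft(X):
    """Inverse of dft (note the 1/N scaling and the sign of the exponent)."""
    n = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * m / n) for k in range(n)) / n
            for m in range(n)]

def circular_convolve(x, h):
    """Convolution that treats both sequences as periodic with period len(x)."""
    n = len(x)
    return [sum(x[m] * h[(i - m) % n] for m in range(n)) for i in range(n)]

# Hypothetical test data: a short signal and a small smoothing kernel.
signal = [1.0, 2.0, 0.0, -1.0, 3.0, 0.5, 0.0, 0.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]

direct = circular_convolve(signal, kernel)
via_fourier = idft([a * b for a, b in zip(dft(signal), dft(kernel))])

# The two routes should agree up to floating-point round-off.
max_error = max(abs(d - v) for d, v in zip(direct, via_fourier))
```

The sketch only demonstrates one direction of the duality (convolution in one domain, multiplication in the other) for one pair of signals; the general statement is developed in the chapters that follow.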

The goal of this unit of the book is to introduce those parts of digital signal 
processing that are necessary to understand these figures, because they represent 
what has to happen inside a rendering program. Sometimes one or more of these 
steps is left out or ignored, and sometimes that’s justifiable; often, though, it isn’t, 
and the result is that we get artifacts in our pictures. 

It’s my opinion that a good understanding of rendering requires a good 
understanding of this flow of information, and that requires a good intuitive feeling for 
the Fourier transform. Chapter 4 starts the unit with the definition of signals and 
systems. We then build to the Fourier transform in Chapter 5 (and in Chapter 6 we 
discuss its more recent cousin, the wavelet transform). The convolution operation 
can be performed as an integration, which can be performed efficiently for very 
complex signals (such as those in graphics) using Monte Carlo methods; we discuss 
these in Chapter 7. We then lock down the interpretation of Figure A with a few 
fundamental theorems in Chapter 8 that tell us exactly how to build our filters in d 
and f given the sampling patterns in b and h under some specific conditions. 
Chapter 9 takes a look at the more complicated problems that arise when the samples 
are not uniformly spaced; that is, the pulses in b are not all the same distance apart. 
After building up all this theory, we turn to practice in Chapter 10 where we survey 
the algorithms people have developed to turn this theory into efficient and practical 
rendering techniques. 




FIGURE C 

The general flow of a 2D signal from definition to display. (a) The input signal a. (b) A set of 
equally spaced sampling pulses b. (c) The product signal ab, formed by multiplying the two signals 
(or sampling signal a). (d) A reconstruction filter. (e) The signal c*d, formed by convolving signal 
c with the filter d. (f) A low-pass filter. (g) The result of convolving e with f. (h) A new set of 
sampling pulses. (i) The product gh. (j) The display reconstruction filter. (k) The displayed signal. 


FIGURE D 

The Fourier transform of the general flow of the 2D signal in Figure C from definition to display. 

Far more important than the nomenclature are 
the underlying concepts. We introduce 
nomenclature only in order to be able to talk 
about the concepts. 

F. E. Nicodemus et al. 

(“Geometrical Considerations and Nomenclature for 
Reflectance,” 1977) 



SIGNALS AND SYSTEMS 


4.1 Introduction 

In this book we will think of synthetic images as multidimensional signals. This 
chapter presents the basic tools for defining and discussing signals and the systems 
that modify them. We will present several different types of signals and the different 
types of information they can represent. We will also discuss the concept of different 
spaces for representing signals. We show some notation that will prove useful later 
in the book, and present a short catalog of useful idealized signals that we will use 
as canonical examples. 

We will discuss a fundamental technique for characterizing systems, and show 
which signals pass through a system unchanged except for scale. 


4.2 Types of Signals and Systems 

For our purposes, a signal is any parametric function, of any number of input and 
output dimensions, for which we would like to find individual values, or average 
values, within some range of the arguments. For example, signals include the 
distribution of light on a screen, the light falling on a point on a surface or in space, or the 
description of a reflection function on a surface. Since both signals and mathematical 
functions have only one value for any set of parameters, we use the terms signal and 
function interchangeably. 

FIGURE 4.1 

(a) y(x) = 2x + 3 is smooth and continuous. (b) y(x) = sin(x) is smooth and continuous. 
(c) y(x) = |x| is not smooth, but continuous. (d) y(x) = sgn(x) is neither smooth nor continuous. 


4.2.1 Continuous-Time (CT) Signals 

The conceptual side of computer graphics often deals with continuous-time (CT) (or 
analytic) signals. These have a symbolic representation that enables us to evaluate 
them for any parameter value; examples include y(x) = 2x + 3 and y(x) = sin(x), 
which are plotted in Figure 4.1(a) and (b). An analytic signal need not be smooth 
(i.e., differentiable everywhere), or continuous (i.e., unbroken). 

The term “continuous-time” is unfortunate, because it implies that the parameter 
to the function is a time value. In fact, this parameter (or argument) may represent 
anything, including time, so perhaps “continuous-parameter” would be a better 
term. But the term “continuous-time” is firmly established in the literature, so we 
will use it here. 

The concepts of a continuous-time signal and a continuous signal are distinct; the 
former term only refers to the analytic representation of the function. Figure 4.1(c) 
and (d) show plots of the functions y(x) = |x| and y(x) = sgn(x); the former is not 
smooth, and the latter is neither smooth nor continuous, but both are continuous-time 
signals, since they can be evaluated for any value of their parameter x. 

FIGURE 4.2 

(a) An even function f(x) = f(-x). (b) An odd function f(x) = -f(-x). 


The term “continuous” is often used in the signal processing literature when 
the more precise term “continuous-time” is meant. The appropriate meaning is 
usually clear from context. When there is possible confusion, I will use the full term 
“continuous-time” (or its acronym CT). 

We will write analytic signals with parentheses around the index, as f(x). The 
index will typically be x or t, referring to spatial position or time. These letters 
may be considered generic indices; the arguments will apply equally well for any 
interpretation of the parameter. Our arguments to analytic functions will typically 
be either real or complex vectors of one or more components. We will sometimes 
write f(t) simply as f for convenience. 

We say a signal is even if it is symmetrical about the origin; that is, for all x, 
f(x) = f(-x), as illustrated in Figure 4.2(a). A signal is odd if it is antisymmetrical 
about the origin; that is, for all x, f(x) = -f(-x), as illustrated in Figure 4.2(b). 
One mnemonic for this definition is to remember that $x^2$ is even (2 is even) and $x^3$ 
is odd (3 is odd). Another common example that will be valuable to us later is that 
cosine is even and sine is odd. 
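
These definitions are easy to test numerically. The following Python sketch is an illustration only; the helper names is_even and is_odd are made up for this example, and checking a finite set of sample points gives evidence of symmetry rather than a proof.

```python
import math

def is_even(f, xs, tol=1e-12):
    """True if f(x) == f(-x) (within tol) at every sample point in xs."""
    return all(abs(f(x) - f(-x)) <= tol for x in xs)

def is_odd(f, xs, tol=1e-12):
    """True if f(x) == -f(-x) (within tol) at every sample point in xs."""
    return all(abs(f(x) + f(-x)) <= tol for x in xs)

# Sample points on the positive axis; each is compared with its mirror image.
xs = [0.1 * i for i in range(1, 50)]

square_is_even = is_even(lambda x: x ** 2, xs)
cube_is_odd = is_odd(lambda x: x ** 3, xs)
cosine_is_even = is_even(math.cos, xs)
sine_is_odd = is_odd(math.sin, xs)

# Most functions are neither even nor odd: 2x + 3 fails both tests.
line_is_neither = (not is_even(lambda x: 2 * x + 3, xs)
                   and not is_odd(lambda x: 2 * x + 3, xs))
```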


4.2.2 Discrete-Time (DT) Signals 

The practical side of computer graphics usually deals with discrete-time (DT) signals, 
also called discrete or sampled signals. These are signals that are only defined 
at particular, discrete locations (typically integer values of the index parameter). 
Sampled counterparts of Figure 4.1 are shown in Figure 4.3. Notions of smoothness 
and continuity do not have analogs in discrete signals. 

FIGURE 4.3 

(a) y[n] = 2n + 3. (b) y[n] = sin(n). (c) y[n] = |n|. (d) y[n] = sgn(n). 

We will write discrete signals with brackets around the index, as f[n]. The index 
will typically be i, k, m, or n, referring generically to any integer. 

Because computers store real numbers with finite precision, signals are usually 
quantized when they are evaluated. Sometimes we can avoid this problem by storing 
our samples in a symbolic form (e.g., √3/2 instead of 0.866), but most often we store 
the results numerically, and thus surrender to the limited precision of the computer. 
Although floating-point numbers are notoriously nonuniform in the quality with 
which they can represent real values, even naive programming seems to perform 
surprisingly well in general. A careful analysis of the quantization error in a computer 
program is notoriously difficult [3,353]; for the most part we will ignore this issue 
in this book and assume that our floating-point number representations are perfect. 
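
A small Python sketch shows the kind of quantization being set aside here. It is an illustration only, and it assumes IEEE 754 double-precision floats (the usual case for Python); exact rational arithmetic from the standard fractions module stands in for "symbolic" storage.

```python
import math
from fractions import Fraction

# Exact (rational) arithmetic: one tenth plus two tenths is three tenths.
exact_sum = Fraction(1, 10) + Fraction(2, 10)
symbolic_equal = (exact_sum == Fraction(3, 10))

# Floating-point arithmetic: 0.1, 0.2, and 0.3 are each rounded to the
# nearest representable double, so the comparison fails.
float_sum = 0.1 + 0.2
float_equal = (float_sum == 0.3)

# The usual remedy is to compare within a tolerance instead of exactly.
close_enough = math.isclose(float_sum, 0.3)

# The example from the text: sqrt(3)/2 stored numerically versus the
# three-digit value 0.866.
stored = math.sqrt(3) / 2
difference_from_three_digits = abs(stored - 0.866)
```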


4.2.3 Periodic Signals 

A signal is periodic if there exists some real number T, called the period of the signal, 
such that for all x, f(x + T) = f(x), as shown in Figure 4.4. If a function is not 
periodic, then it is aperiodic. By convention, T is positive; this saves us from needing 
some absolute-value signs later on. The most common form of aperiodic signal in 
practice is one that is everywhere 0 to the left and right of some interval. Any signal 
that is zero outside of some finite fixed interval (called the active interval, the support 
interval, or the region of support) is said to have compact support. We encounter 
signals with compact support all the time in computer graphics. For example, an 
image is 0 beyond the boundary of the raster; light starts streaming into a scene at 
some time and then stops some time later; and objects are at rest, then move, and 
then come to rest again. These are all aperiodic signals with fixed support. 

FIGURE 4.4 

A periodic function with interval T: f(x + T) = f(x). 

We will often form periodic signals by repeating an aperiodic signal. To make a 
periodic signal g(t) with period T from an aperiodic signal f(t), we write 

$$g(t) = \sum_{k=-\infty}^{+\infty} f(t - kT) \tag{4.1}$$

Because this notational fragment will recur frequently in this book, it is worth 
spending a moment now to become familiar with this idiom. It is most common 
in the case where f(t) has compact support; that is, f(t) = 0 for all $|t| > t_f$, as in 
Figure 4.5(a). 

Equation 4.1 builds a new signal g(t) from f(t) by repeating it every T units 
(for the moment, we assume $T > 2t_f$, so the copies don’t overlap). Suppose we 
know the value of f(s) for some real number $-T/2 < s < T/2$. Then 

$$g(s) = \cdots + f(s - T) + f(s) + f(s + T) + \cdots = f(s)$$

since only the f(s) term can be nonzero. 

Now suppose we want the value of g(s + T). Using Equation 4.1 we find that 

$$\begin{aligned}
g(s + T) &= \cdots + f((s + T) + T) + f((s + T)) + f((s + T) - T) + f((s + T) - 2T) + \cdots \\
         &= \cdots + f(s + 2T) + f(s + T) + f(s) + f(s - T) + \cdots
\end{aligned} \tag{4.2}$$

All these values are 0 except f(s); thus, g(s + T) = f(s) = g(s), so g(t) is a periodic version 
of f(t) with period T, as shown in Figure 4.5(b). If $T < t_f$, then the repeated copies 
of f(t) will overlap and sum together, as in Figure 4.5(c). 

FIGURE 4.5 

(a) The signal f(t), which is zero for all $|t| > t_f$. (b) The signal $g(t) = \sum_k f(t - kT)$. This 
repeats the input signal every T units. (c) When $T < t_f$, the copies of f will overlap. 
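
The replication idiom of Equation 4.1 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a general implementation: the triangle pulse f, the constant T_F, and the helper periodize are hypothetical names chosen for the example, and the infinite sum is truncated to the finitely many copies that can reach the evaluation point, which is valid only because f has compact support.

```python
import math

T_F = 1.0  # half-width of the support: f(t) = 0 for |t| >= T_F

def f(t):
    """A triangle pulse, used here as an example signal with compact support."""
    return max(0.0, 1.0 - abs(t) / T_F)

def periodize(signal, period, half_width):
    """Return g with g(t) = sum over integers k of signal(t - k * period).

    Because signal vanishes outside [-half_width, half_width], only the
    finitely many k with |t - k * period| <= half_width can contribute.
    """
    def g(t):
        k_lo = math.floor((t - half_width) / period)
        k_hi = math.ceil((t + half_width) / period)
        return sum(signal(t - k * period) for k in range(k_lo, k_hi + 1))
    return g

# Case 1: T > 2 t_f, so the copies do not overlap and g(s) = g(s + T) = f(s).
g = periodize(f, 3.0, T_F)
s = 0.25
no_overlap_values = (f(s), g(s), g(s + 3.0))

# Case 2: T < t_f, so neighboring copies overlap and sum together.
g_overlap = periodize(f, 0.75, T_F)
overlap_value = g_overlap(0.0)  # f(0) + f(0.75) + f(-0.75)
```

In the second case the value at the origin is larger than f(0), because the tails of the two neighboring copies are added in.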


4.2.4 Linear Time-Invariant Systems 

Anything that alters a signal may be considered a system. For example, a concert hall 
may be considered a system. In this case, think of the sound of a violin as a signal 
represented by the amplitude of sound with respect to time. So a concert hall changes 
an input signal (a violin played on stage) to an output signal (the particular sound 
you hear at some particular seat). In computer graphics, our systems will typically 
be programs, acting either as models of physical systems or in more abstract settings. 
For example, a program to calculate reflection is often based on a physical reflection 
model, while color-space transformations are abstract operations. 

The easiest systems to understand are linear systems. For example, 
suppose we have a system $\mathcal{L}$ that maps (or transforms) some input x(t) to an output 
y(t). We write $y(t) = \mathcal{L}\{x(t)\}$; in mathematical terms, $\mathcal{L}$ is an operator. An operator 
may be imagined as a device that takes in some object as an argument and returns 
some new object, which is not necessarily of the same type as the input. In this 
case $\mathcal{L}$ takes as input a function x(t), and returns a new function y(t). When we 
drop the explicit argument, we write $\mathcal{L}\{x\}$, which we often abbreviate simply as $\mathcal{L}x$. 
To return the argument into this last form, we parenthesize the new operated-upon 
function, writing $y(t) = (\mathcal{L}x)(t)$. 

We say that $\mathcal{L}$ is linear if, for any two scalars a and b, and any two signals f(t) 
and g(t), the following is true: 

$$\mathcal{L}\{af + bg\} = a\mathcal{L}\{f\} + b\mathcal{L}\{g\} \tag{4.3}$$

This important definition, diagrammed in Figure 4.6, is actually two definitions 
in one. The first states that if we scale an input to the system, the output is an 
equally scaled version of the output to an unscaled version of the input. In symbols, 
$\mathcal{L}\{ag\} = a\mathcal{L}\{g\}$; this is diagrammed in Figure 4.7. The second property states that 
the response to the sum of two signals is equal to the sum of the responses to the 
individual signals; in symbols, $\mathcal{L}\{f + g\} = \mathcal{L}\{f\} + \mathcal{L}\{g\}$. This is diagrammed in 
Figure 4.8. These properties are at the foundation of many simplifying assumptions 
that make linear systems easy to analyze and describe. In most of this book, we will 
discuss only linear systems. 

FIGURE 4.6 

$\mathcal{L}\{af + bg\} = a\mathcal{L}\{f\} + b\mathcal{L}\{g\}$. 

FIGURE 4.7 

$\mathcal{L}\{ag\} = a\mathcal{L}\{g\}$. 

FIGURE 4.8 

$\mathcal{L}\{f + g\} = \mathcal{L}\{f\} + \mathcal{L}\{g\}$. 


Our definition of a linear system includes many operators with which we are 
familiar. Addition and multiplication are linear, as are summation and integration, 
though the functions square root and floor are not. 
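
The linearity condition of Equation 4.3 can be probed numerically. The sketch below is illustrative only: looks_linear is a hypothetical helper, the operators are applied pointwise to a few hand-picked sample values, and passing the check for one choice of a, b, f, and g is necessary for linearity but does not prove it. Failing the check, on the other hand, is conclusive.

```python
import math

def looks_linear(op, f_vals, g_vals, a, b, tol=1e-9):
    """Compare op(a*f + b*g) against a*op(f) + b*op(g) on sample values."""
    lhs = [op(a * f + b * g) for f, g in zip(f_vals, g_vals)]
    rhs = [a * op(f) + b * op(g) for f, g in zip(f_vals, g_vals)]
    return all(abs(l - r) <= tol for l, r in zip(lhs, rhs))

# Sample values of two signals f and g (nonnegative so sqrt is defined).
f_vals = [1.0, 4.0, 9.0]
g_vals = [16.0, 0.25, 2.0]
a, b = 0.5, 0.5

def triple(v):
    """Multiplication by a constant: a linear operator."""
    return 3.0 * v

scaling_passes = looks_linear(triple, f_vals, g_vals, a, b)
sqrt_passes = looks_linear(math.sqrt, f_vals, g_vals, a, b)
floor_passes = looks_linear(math.floor, f_vals, g_vals, a, b)
```

Here scaling by a constant passes, while square root and floor both fail, in agreement with the text.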

If a system obeys Equation 4.3 for real numbers a and b, we say the system is 
real-linear. If it also holds when a and b are complex numbers (discussed below), we 
say the system is complex-linear. 

One property we will often exploit is that since integration and summation are 
linear operators, we can move another linear operator $\mathcal{L}$ through them: 

$$\begin{aligned}
\mathcal{L}\left\{\int f(t)\,dt\right\} &= \int \mathcal{L}\{f(t)\}\,dt \\
\mathcal{L}\left\{\sum_{k=-\infty}^{+\infty} f[k]\right\} &= \sum_{k=-\infty}^{+\infty} \mathcal{L}\{f[k]\}
\end{aligned} \tag{4.4}$$

We will use linearity properties often in this book. 

In addition to linearity, we will further restrict our attention to time-invariant 
systems. These are systems that respond the same way no matter when the signal 
arrives. For example, consider an idealized concert hall, where properties such as 
temperature and humidity are constant. If you clap your hands in such a hall, you 
create a sudden pulse of sound that reverberates through the space in a particular 
way characteristic of that room. If you clap your hands again a minute or two later, 
the response is the same; the time you applied the signal is not relevant. This is an 
example of a time-invariant system. In symbols, 

$$\text{if } y(t) = f(x(t)), \text{ then } y(t - \tau) = f(x(t - \tau)) \text{ for any } \tau \tag{4.5}$$

No real-world physical systems are truly time-invariant. As an example of a 
system that is not time-invariant, consider the energy required to make a pot of cold 
water boil. Initially the pot is cold, so you must not only heat the water but the 
pot as well. Once the water is boiling, you can pour it out and replace it with new 
cold water. Since the pot is already hot, you need less energy to bring this new pot 
of water to a boil than you needed for the first pot. So the same input (a fresh pot 
of cold water) produces different results, depending on what has gone before. The 
output of a time-invariant system to a given signal is always the same, while the 
output of a time-variant system to the same signal depends on the signals that have 
come before. 

A discrete system obeying the similar rule 

if y[n] = f(x[n]), then y[n — m] = f(x[n — m]) for any m (4.6) 

is called shift-invariant. These terms are usually written as acronyms, so the linear, 
time-invariant property is written LTI, and similarly, a linear, shift-invariant system 
is an LSI system. 
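
The discrete rule in Equation 4.6 is simple to test in code. The following Python sketch is an illustration with made-up names; to keep the sequences finite it treats them as periodic, so "shift" means a circular shift, which is an assumption of this example rather than part of the definition.

```python
def shift(x, m):
    """Delay a finite sequence by m samples, wrapping around (circular shift)."""
    n = len(x)
    return [x[(i - m) % n] for i in range(n)]

def moving_average(x):
    """Average each sample with its two neighbors: the rule is the same everywhere."""
    n = len(x)
    return [(x[(i - 1) % n] + x[i] + x[(i + 1) % n]) / 3.0 for i in range(n)]

def ramp_gain(x):
    """Multiply sample number i by i: the rule depends on absolute position."""
    return [i * v for i, v in enumerate(x)]

def is_shift_invariant(system, x, m):
    """Check that shifting then processing equals processing then shifting."""
    return system(shift(x, m)) == shift(system(x), m)

x = [1, 2, 3, 4, 5]
average_is_invariant = is_shift_invariant(moving_average, x, 1)
ramp_is_invariant = is_shift_invariant(ramp_gain, x, 1)
```

The moving average commutes with the shift, so it behaves as a shift-invariant system on this input; the position-dependent gain does not.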

In most of this book we will discuss only LTI and LSI systems. Systems that are 
either nonlinear, time-variant, or both are usually much more difficult to analyze and 
understand, though chaos, fuzzy logic, and complexity theory are making fascinating 
progress [2,161,226,227,253]. 


4.3 Notation 

In this section we present some pieces of notation and terminology that will simplify 
the discussions throughout the book. 


4.3.1 The Real Numbers 

The set of all real numbers is denoted by $\mathcal{R}$. To indicate that any particular number 
r is real, we write it as a member of this set: $r \in \mathcal{R}$. The symbol for the reals is 
sometimes used to indicate the domain (or set of possible inputs) for a function. If a 
function f takes a real number to another real (e.g., f(x) = 3x), we say that f maps 
the reals into the reals. We say the same thing symbolically as $f: \mathcal{R} \mapsto \mathcal{R}$. 

In computer graphics we often deal with spaces with more than one dimension. 
A vertex V of a polygon, for example, may be specified by three real numbers. We 
say that V is drawn from a space built from three reals by writing $V \in \mathcal{R}^3$. A matrix 
M that transforms the vertices of a polygon from one 3D orientation to another may 
be said to map $V \in \mathcal{R}^3$ to $V' \in \mathcal{R}^3$, or $M: \mathcal{R}^3 \mapsto \mathcal{R}^3$. 


4.3.2 The Integers 

The set of integers is denoted by $\mathcal{Z}$. Thus if we write $k \in \mathcal{Z}$, we are saying that k can 
take on any negative or positive integer value, including 0. The canonical integers 
are denoted by letters from i to n, inclusive (although j will always stand for the 
square root of -1, as discussed in more detail below). 

A function f[n] which maps an integer to a real, such as f[n] = √n, can be 
written $f: \mathcal{Z} \mapsto \mathcal{R}$. 

An important subset of the integers is the set of integers mod N. The word mod stands 
for modulo arithmetic; the value of a mod b is the remainder when a is divided by 
b. For example, 2 mod 3 = 8 mod 3 = 2. We write the set of integers modulo 
some number N as $\mathcal{Z}/N$. For example, $\mathcal{Z}/5 = \{0, 1, 2, 3, 4\}$, and the binary group 
$\mathcal{Z}/2 = \{0, 1\}$. Anyone who has worked with integers on a computer has had 
experience with modulo arithmetic. An 8-bit register can hold the integers from 0 to 
255. Thus, in $\mathcal{Z}/256$, the sum 255 + 1 = 0. We will write this 255 + 1 = 0 (mod 256), 
when the nature of the arithmetic isn’t clear from the context. In general, the set 
$\mathcal{Z}/N = \{0, 1, 2, \ldots, N - 1\}$. 
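
The examples above can be reproduced directly in Python, as a brief sketch. The helper add_mod is a name invented for this illustration. One caution: Python's % operator returns a nonnegative remainder for a positive modulus, which matches the set $\{0, 1, \ldots, N - 1\}$ used here; some other languages return negative remainders for negative operands.

```python
def add_mod(a, b, n):
    """Add two integers in Z/n, returning a value in {0, 1, ..., n - 1}."""
    return (a + b) % n

# An 8-bit register wraps around: 255 + 1 = 0 (mod 256).
wrapped = add_mod(255, 1, 256)

# 2 mod 3 and 8 mod 3 leave the same remainder.
same_remainder = (2 % 3, 8 % 3)

# Reducing any run of integers mod 5 lands in Z/5 = {0, 1, 2, 3, 4}.
z5 = sorted({k % 5 for k in range(-10, 11)})
```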


4.3.3 Intervals 

An interval is a range of real or integer values. Explicitly, if a is bounded by two 
values $a_0$ and $a_1$ such that $a_0 \le a_1$, then the interval of all a satisfying $a_0 \le a \le a_1$ is 
written $[a_0, a_1]$. If a is not intended to actually include its lower bound $a_0$, then we 
use a round parenthesis for the left extreme: $a \in (a_0, a_1]$. Similarly, we can exclude 
the upper bound, $a \in [a_0, a_1)$, or both. 

This notation is motivated by the problem of partitioning an interval. Suppose 
we have an interval A = [a, c], where a < b < c. Then we can partition A into two 
pieces $A_1 = [a, b)$ and $A_2 = [b, c]$. Together, $A_1 \cup A_2 = A$, as in Figure 4.9(a). If we 
defined $A_1$ to include b at its upper limit and $A_2$ to include b at its lower limit, then 
we would have b represented twice when the sets were combined as in Figure 4.9(b). 
This notation allows us to place b in only one set, avoiding the problem. Two sets 
with no common elements are disjoint. 
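
For integer intervals, the same bookkeeping shows up in everyday code. As a small illustrative sketch (the values of a, b, and c are arbitrary), Python's range(start, stop) is itself a half-open interval [start, stop), so partitions built from it never count the shared endpoint twice.

```python
a, b, c = 0, 3, 6

# [a, b) followed by [b, c]: the endpoint b appears exactly once.
half_open = list(range(a, b)) + list(range(b, c + 1))

# [a, b] followed by [b, c]: the endpoint b is duplicated.
closed = list(range(a, b + 1)) + list(range(b, c + 1))

# The two half-open pieces share no elements, so they are disjoint.
pieces_are_disjoint = set(range(a, b)).isdisjoint(range(b, c + 1))
```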

A single scalar a may be represented by a degenerate interval [a, a]. Sometimes it 
is useful to specify just the size of an interval without fixing it to a particular starting 
value. We call this a free interval and write it as [a]. Thus [a] represents the interval 
[g, a + g] for any g. We will often see [N] = [0, 1, ..., N - 2, N - 1]. In general, when 
we write $k \in [N]$, we mean any sequence [k, k + 1, ..., k + N - 2, k + N - 1] mod N. 
So [5] = [0, 1, 2, 3, 4], though we could interpret it as [5] = [2, 3, 4, 0, 1], since where 
we start doesn’t matter. 

In this book, we will represent intervals with capital Greek letters (e.g., Γ and Λ). 

If an interval is restricted to the integers, then there are only a finite number of 
values in that range. For example, the integer interval [0,5] is [0,1,2,3,4,5], while 
the real interval [0,5] contains an infinite number of values. We can distinguish these 
types of intervals by writing $\Gamma \in \mathcal{Z}$ or $\Gamma_{\mathcal{Z}}$ for an integer interval, and similarly $\Gamma \in \mathcal{R}$ 
or $\Gamma_{\mathcal{R}}$ for a real interval. 

FIGURE 4.9 

(a) [a, b) ∪ [b, c]. (b) [a, b] ∪ [b, c] duplicates b. 


4.3.4 Product Spaces 

It is often convenient to bundle up two or more spaces into one composite space. 
For example, the function f(a, k) = ak multiplies together a real number $a \in \mathcal{R}$ 
and an integer $k \in \mathcal{Z}$. So the domain of this function is both the set of reals and 
the set of integers. We combine these two individual domains with the operator 
$\otimes$, which is called the Cartesian product operator, so the resulting space is called a 
Cartesian product space. The resulting space is a combined space; if we form the 
Cartesian product of the integers and the reals, we get a new space that has an integer 
component and a real. An element of this space would be a pair of numbers, say an 
integer k and a real a: $(k, a) \in \mathcal{Z} \otimes \mathcal{R}$. We would write f as taking an argument 
from the product space and returning a real: $f: \mathcal{R} \otimes \mathcal{Z} \mapsto \mathcal{R}$, indicating that it takes 
a real and an integer and returns a real. 

As another example, consider the function g = aU, which scales a 3D vector U 
by a real number a. The domain is thus formed by the Cartesian product of the real 
numbers $\mathcal{R}$ and the 3D vectors $\mathcal{R}^3$, so $g: \mathcal{R} \otimes \mathcal{R}^3 \mapsto \mathcal{R}^3$. 

A related idea is the Cartesian sum, written $\oplus$, which forms the union of two 
spaces. Since the reals contain the integers, the Cartesian sum of the reals and the 
integers is just the reals: $\mathcal{Z} \oplus \mathcal{R} = \mathcal{R}$, and the Cartesian sum of the reals and the reals 
is just the reals again: $\mathcal{R} \oplus \mathcal{R} = \mathcal{R}$. For example, if we had an (admittedly strange) 
function that accepted as arguments only the integers mod 5 and the reals, we would 
write its input domain as $[5] \oplus \mathcal{R}$ (or $\mathcal{R} \oplus [5]$, since union is commutative). 

FIGURE 4.10 

(a) The Argand diagram for a point z and its conjugate $\bar{z}$. (b) A point $z = (x, y) = 
(r\cos(\theta), r\sin(\theta))$. 


4.3.5 The Complex Numbers 

A complex number is a pair of real numbers: z = (a, b). Complex numbers may be 
interpreted in a variety of ways. One common approach is to write the number as 
the sum of real and imaginary components. 

Imaginary numbers involve the square root of -1. In this book I adopt the 
electrical engineer’s notation and use j for this value: $j^2 = -1$. The other common 
choice for this symbol is the letter i. I chose j because it is widely used, and because 
i is a canonical integer index in much of computer science and this book. So we can 
define z = a + jb. We write the real part of z as Re(z) = a and the imaginary part 
as Im(z) = b. The set of all complex numbers is represented by $\mathcal{C}$. 

Complex numbers are often plotted on an Argand diagram, shown in Figure 4.10. 
This is a standard 2D grid, with the x axis identified with the real component of z and 
the y axis identified with the imaginary component. Thus a point (x, y) corresponds 
to the complex number x + jy. 

The complex conjugate of any complex number z is written as $\bar{z}$ or $z^*$; we will use 
the former notation in this book. The complex conjugate is found by negating the 
imaginary component (or, graphically, by reflecting the point across the x axis). Thus 
$\bar{z} = a - jb$, as in Figure 4.10(a). 

An alternative to the Cartesian system of the Argand diagram is the polar 
coordinate system. Here each point $(r, \theta)$ is expressed by its radius r and angle 
$\theta$ measured counterclockwise from the x axis. The two systems are related by 
$(x, y) = (r\cos\theta, r\sin\theta)$, as in Figure 4.10(b). 

In a 2D coordinate system such as Figure 4.10, the distance from the origin to any 
point P with coordinates $(P_x, P_y)$ is found by $\sqrt{P_x^2 + P_y^2}$. By analogy, we can write 
the squared magnitude of a complex number z as $|z|^2 = z\bar{z} = (a + jb)(a - jb) = 
a^2 + b^2$. We say the phase or angle of a complex number is given by its angle in 
the polar interpretation, so the phase of a complex number z is the angle $\theta$ in 
Figure 4.10(b), found from the inverse tangent: $\angle z = \tan^{-1}(b/a)$. 
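
Python's built-in complex type (which happens to use the same j notation) makes these relationships easy to check. The sketch below is illustrative only and uses the standard cmath and math modules; the sample value 3 + 4j is arbitrary. It computes the phase with the two-argument arctangent, which places the angle in the correct quadrant, a detail the single-argument inverse tangent leaves open.

```python
import cmath
import math

z = complex(3.0, 4.0)      # z = a + jb with a = 3, b = 4
z_bar = z.conjugate()      # the conjugate a - jb

# Squared magnitude as the product of z and its conjugate: a^2 + b^2.
squared_magnitude = (z * z_bar).real

# Polar form (r, theta) and the phase from the two-argument arctangent.
radius, theta = cmath.polar(z)
phase = math.atan2(z.imag, z.real)

# Converting back from polar form recovers (x, y) = (r cos(theta), r sin(theta)).
back = cmath.rect(radius, theta)
```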

Since every real number a may be considered a complex number a + 0j, we say 
that the real numbers are a subset of the complex numbers: $\mathcal{R} \subset \mathcal{C}$. In this chapter 
we will see continuous functions f(z), which map complex numbers to new complex 
numbers. We often say that the domain $\mathcal{C}$ is mapped onto itself, or that the function 
maps the complex numbers onto themselves. We say that f is a complex-valued 
complex function if $f: \mathcal{C} \mapsto \mathcal{C}$. Note that such a class of functions includes the 
real-valued real functions $f: \mathcal{R} \mapsto \mathcal{R}$, the real-valued complex functions $f: \mathcal{C} \mapsto \mathcal{R}$, 
and the complex-valued real functions $f: \mathcal{R} \mapsto \mathcal{C}$ as special cases. For all functions 
$f: A \mapsto B$, we call A the domain of f, and B the range. 

Another handy feature of real numbers is that they are their own complex 
conjugates: a + 0j = a - 0j. So if we have a function $f(x) \in \mathcal{R}$, then for all x, $f(x) = \overline{f(x)}$. 
This will prove useful when writing formulas later on, particularly with the braket 
notation discussed below. 

A Hermite function f(t) is one that is symmetrical about the origin except for 
conjugation; that is, it satisfies $f(t) = \overline{f(-t)}$. 

If a system $\mathcal{L}$ is linear then its real and imaginary parts are processed 
independently. That is, $\mathcal{L}\{z\} = \mathcal{L}\{\mathrm{Re}(z)\} + j\mathcal{L}\{\mathrm{Im}(z)\}$. So to transform a purely real signal, 
we can attach any imaginary part, do the transform, and then ignore the imaginary part. 


4.3.6 Assignment and Equality 

The symbol $\triangleq$ is used in this book to indicate a definition (some authors use $\equiv$ or 
simply =). 

The equal sign = is often used in computer languages to indicate assignment to 
a variable. In pseudocode we will use the left arrow ← to indicate assignment. The 
equal sign will be used only to express equality between two expressions. 


4.3.7 Summation and Integration 

We will often write summations and integrations over the reals or the integers. To 
reduce notational clutter, we will define shorthand forms for these operators. 

For infinite summations, we will indicate the index k over the range $[-\infty, \infty]$ 
either by $k \in \mathcal{Z}$, or simply with the argument k: 

$$\sum_{k} = \sum_{k \in \mathcal{Z}} = \sum_{k=-\infty}^{+\infty} \tag{4.7}$$

For infinite integrals, the integrand is in the argument, so we write 

$$\int dt = \int_{-\infty}^{+\infty} dt \tag{4.8}$$

We will often sum over the integer interval [0, N) = [0, N - 1] = [N] for some N. 
We define 

$$\sum_{k \in [N]} = \sum_{k=0}^{N-1} \tag{4.9}$$


4.3.8 The Complex Exponentials 

A famous identity due to Euler is 

$$e^{j\theta} = \cos\theta + j\sin\theta \tag{4.10}$$

where e is the base of the natural logarithm; e ≈ 2.71828 [458]. This type of 
complex-valued function is called a complex exponential or complex sinusoid. Proof 
of the identity can be found by writing out the Taylor series for $e^x$, and noticing its 
relationship to the Taylor series for sine and cosine. 
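
The Taylor-series argument can be checked numerically with a short Python sketch. This is an illustration, not a proof: exp_taylor is a hypothetical helper that sums a fixed number of terms of the series for the exponential, and the angle 0.75 is an arbitrary test value.

```python
import cmath
import math

def exp_taylor(z, terms=30):
    """Partial sum of the Taylor series for exp(z): sum of z**n / n!."""
    total = 0j
    term = 1 + 0j              # the n = 0 term, z**0 / 0!
    for n in range(terms):
        total += term
        term *= z / (n + 1)    # turn z**n / n! into z**(n+1) / (n+1)!
    return total

theta = 0.75
series_value = exp_taylor(1j * theta)                     # e^{j theta} from the series
euler_value = complex(math.cos(theta), math.sin(theta))   # cos(theta) + j sin(theta)
library_value = cmath.exp(1j * theta)                     # the library's exponential

series_error = abs(series_value - euler_value)
library_error = abs(library_value - euler_value)
```

Splitting the series into its even-power and odd-power terms is what produces the cosine and sine series, respectively; the code only confirms that the totals agree.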

Euler’s identity can be used to generate many other identities that will come in 
handy when we perform symbolic manipulations on complex numbers. Table 4.1 
lists many of these identities, along with the definition above and some standard 
results on exponentials from trig, algebra, and calculus. They are labeled E1 through 
E21 so that we can later refer to different properties efficiently, though several are 
simple variations on another. Some of the less obvious but useful identities are left 
as exercises. We will use some of the more powerful identities later on to simplify 
taking forward and inverse Fourier transforms. The sinc function in E21 will be 
introduced in Section 4.4.4. 

E1    $e^{j\alpha} = \cos(\alpha) + j\sin(\alpha)$
E2    $\mathrm{Re}(e^{j\alpha}) = \cos(\alpha)$
E3    $\mathrm{Im}(e^{j\alpha}) = \sin(\alpha)$
E4    $|e^{j\alpha}| = 1$
E5    $|e^{-j\alpha}| = 1$
E6    $e^{-j\alpha} = \overline{(e^{j\alpha})}$
E7    $e^{j\alpha_1} e^{j\alpha_2} = e^{j(\alpha_1 + \alpha_2)}$
E8    $e^{j\alpha} e^{-j\alpha} = 1$
E9    $e^{j2\pi k + \alpha} = e^{\alpha}$ for $k \in \mathcal{Z}$
E10   $\frac{d}{dt} e^{\alpha t} = \alpha e^{\alpha t}$
E11   $\int e^{\alpha t}\,dt = e^{\alpha t}/\alpha$
E12   $\frac{e^{j\alpha} - e^{-j\alpha}}{2j} = \sin(\alpha)$
E13   $\frac{e^{j\alpha} + e^{-j\alpha}}{2} = \cos(\alpha)$
E14   $1 - e^{-j\alpha} = e^{-j(\alpha/2)} \left[ e^{j(\alpha/2)} - e^{-j(\alpha/2)} \right]$
E15   $\sum_{m=0}^{W} e^{-j\alpha m} = e^{-j\alpha W/2}\, \frac{\sin\left(\frac{\alpha}{2}(W + 1)\right)}{\sin\left(\frac{\alpha}{2}\right)}, \quad \alpha \ne k2\pi,\ k \in \mathcal{Z}$
E16   $\sum_{m=-W}^{W} e^{-j\alpha m} = \frac{\sin\left(\frac{\alpha}{2}(2W + 1)\right)}{\sin\left(\frac{\alpha}{2}\right)}, \quad \alpha \ne k2\pi,\ k \in \mathcal{Z}$
E17   $e^{j2\pi k} = 1$ for $k \in \mathcal{Z}$
E18   $e^{0} = 1$
E19   $\frac{e^{j\alpha} - e^{-j\alpha}}{e^{j\beta} - e^{-j\beta}} = \frac{\sin(\alpha)}{\sin(\beta)}, \quad \beta \ne k\pi,\ k \in \mathcal{Z}$
E20   $\int e^{-\pi u^2}\,du = 1$
E21   $\int_{-W}^{W} e^{-j\alpha t}\,dt = 2W\,\mathrm{sinc}(\alpha W/\pi)$

TABLE 4.1 

Some properties of the complex exponentials. 

FIGURE 4.11 

A sinusoid passes through $\omega$ cycles in a time period of $2\pi$. 
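
A few of the less obvious entries in Table 4.1 can be spot-checked numerically. The Python sketch below is an illustration under stated assumptions: it tests E15, E16, and E21 for one arbitrary choice of α and W, it approximates the integral in E21 with a simple midpoint rule, and it assumes the normalized convention sinc(x) = sin(πx)/(πx), which is the one that makes E21 hold as written.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

alpha, W = 0.7, 5  # arbitrary test values; alpha is not a multiple of 2 pi

# E15: the one-sided sum from m = 0 to W.
e15_lhs = sum(cmath.exp(-1j * alpha * m) for m in range(W + 1))
e15_rhs = (cmath.exp(-1j * alpha * W / 2)
           * math.sin(alpha * (W + 1) / 2) / math.sin(alpha / 2))

# E16: the symmetric sum from m = -W to W (its imaginary part cancels).
e16_lhs = sum(cmath.exp(-1j * alpha * m) for m in range(-W, W + 1))
e16_rhs = math.sin(alpha * (2 * W + 1) / 2) / math.sin(alpha / 2)

# E21: integrate e^{-j alpha t} over [-W, W] with the midpoint rule.
steps = 20000
dt = 2 * W / steps
e21_lhs = sum(cmath.exp(-1j * alpha * (-W + (i + 0.5) * dt))
              for i in range(steps)) * dt
e21_rhs = 2 * W * sinc(alpha * W / math.pi)
```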

We can see from Euler’s relation that a complex exponential is a sum of sine and 
cosine terms. Since each of these is periodic with period $2\pi$, the whole complex 
exponential is also periodic with period $2\pi$. It is common to write the exponential 
with a mixed exponent involving both frequency $\omega$ and time t, as in $e^{j\omega t}$, and 
associate with it a period T: 

$$T = 2\pi/\omega \tag{4.11}$$

The frequency is determined by $\omega$, while t typically sweeps out the complex sinusoid 
of that frequency. These terms are motivated by a picture such as Figure 4.11, where 
a sine wave passes through $\omega$ cycles in a time interval of $2\pi$. One cycle takes up a 
width of T on the time axis. Here again standard notation uses the word “time” 
and index t in these functions, but any parameter would do as well. 

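To see that these functions are periodic with interval T, we write the value of any 
such function at times t and t + T as 

$$\begin{aligned}
e^{j\omega t} &= e^{j\omega(t + T)} \\
              &= e^{j\omega(t + 2\pi/\omega)} \\
              &= e^{j\omega t}\, e^{j2\pi} \\
              &= e^{j\omega t}
\end{aligned} \tag{4.12}$$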


using E17 with k = 1. 

An eigenfunction is any function that passes through a system unchanged except 
for a constant scaling of its amplitude. Thus, for any f(t) that is an eigenfunction of 
a system $\mathcal{S}$, $\mathcal{S}\{f(t)\} = sf(t)$ for some (perhaps complex) constant s. That value of 
s for a particular function f(t) is called the eigenvalue for that eigenfunction. These 
terms come from the German word eigen, meaning “own” or “characteristic.” 

The complex exponentials are very important to us in this book because they 
are the eigenfunctions for any LTI system, and the scaling factors are the associated 
eigenvalues. We will prove this below when we discuss convolution. This property 
forms the heart of the Fourier transform, which we discuss in detail later. 

Shorthand Complex Exponentials

We will use complex exponentials often in this book. Sometimes we need to have all 
the exponents available in order to carry out a calculation or derivation, but often 
they are just clutter that gets in the way of an intuitive interpretation. 

To reduce this clutter, we will sometimes use a shorthand notation for the complex
exponentials. We will use primarily four types of these functions in this book, which 
we define as 


= e ju;t 
ipk = e”** 

V4 = e jk(2n/N)n 

fa = e^+vy) (4.13) 

We have already seen $\phi$ and $\psi_k$; the other functions will be discussed when
they are encountered.


4.3.9 Braket Notation

The physicist P. A. M. Dirac introduced a notation called the bra-and-ket or braket 
notation primarily to simplify some common expressions in quantum mechanics. 
These expressions are of the same form as the Fourier expressions, which will occupy 
much of our attention in this unit, so this notation is well suited to our needs. 

The full definition of this notation can be derived from some basic ideas in group 
and measure theories. Building up to the definition is straightforward, but would 
take us far afield from our subject matter; a nice derivation may be found in Reid 
and Passin’s book [357]. We will not be using this notation in its full generality; the 
exposition below is limited just to what we will need. 

There are two pieces to this notation. The first piece we consider is the one called
a ket (rhymes with bet). A ket is based on a function $g$, and is written $|g\rangle$.
Note that the delimiters $|$ and $\rangle$ are not absolute-value and greater-than signs,
but form a single entity along with $g$; the right bracket $\rangle$ is also narrower
than the greater-than sign >. In general, we will consider kets to be objects in some space.

Typically, kets are objects without a coordinate system attached to them. For 
example, given two points A and B in space, a ket could represent A - B. Note 
that A - B completely specifies a measurable, precise quantity, though we haven’t
defined this difference with respect to any particular coordinate system (e.g., standard 
Euclidean, polar, cylindrical, and so on). 

To turn a ket into a number, we typically use a projection operator, which maps
the object into some particular coordinate system. In this text, this abstract idea
boils down to multiplying a ket by a bra (rhymes with la). A bra is also built from a
function: for the function $f$ the bra is written $\langle f|$.

Multiplying a bra and a ket together yields a braket, written $\langle f | g \rangle$.
If $f$ and $g$ are both complex-valued functions on a continuous space (such as
$\mathcal{R}$), then the braket is defined as

$$\langle f | g \rangle = \int \overline{f(t)}\, g(t)\, dt \tag{4.14}$$

Note that the domain of integration is not specifically mentioned in the braket; it
must be understood from context.

If $f$ and $g$ map the integers to the real or complex numbers, then the braket is
defined as

$$\langle f | g \rangle = \sum_n \overline{f[n]}\, g[n] \tag{4.15}$$

An example of the braket is the familiar Euclidean dot product of two real vectors
$A$ and $B$:

$$\langle \overline{A} | B \rangle = (A_x B_x + A_y B_y) = A \cdot B \tag{4.16}$$

Note that we have used $\overline{A}$ in the braket, since the braket conjugates its
first argument. Because of this interpretation, the braket can often be considered very
similar to the dot product. When the first argument is real, it doesn’t matter if it’s
conjugated or not, since a real is its own conjugate. In this case
$\langle \overline{A} | B \rangle = \langle A | B \rangle$ is often pronounced “A dot B.”

The power of the braket notation is that it lets us think about objects such as
$|g\rangle$, and ways to measure them such as $\langle f|$, without getting bogged down
in the details of the representation and the measurement. This is the same reason we
write $A \cdot B$ rather than $\sum_{i=1}^{n} A_i B_i$; they both mean the same thing,
but the former is more succinct and general.

The reason for defining the braket this way has to do with the types of formulas
we will encounter later on. In this book you can usually think of $\langle f | g \rangle$
as the projection of a vector $g$ on a vector $f$, where we will sometimes use functions
rather than vectors.

The reason for conjugating the first argument in a braket is simply to make the
result easier to use. Because our functions are generally complex-valued, if we take
the conjugate of one, then the braket of a function with itself is the integral of the
squared magnitude of the function. Writing $f(t) = a(t) + j\,b(t)$ for real $a$ and $b$:

$$\begin{aligned}
\langle f | f \rangle &= \int \overline{f(t)}\, f(t)\, dt \\
&= \int [a(t) - j\,b(t)]\,[a(t) + j\,b(t)]\, dt \\
&= \int [a^2(t) + b^2(t)]\, dt \\
&= \int |f(t)|^2\, dt
\end{aligned} \tag{4.17}$$

Note that we may use a constant $z \in \mathcal{C}$ for either a bra or ket; simply
interpret it as a constant function $f(t) = z$. Also, observe that because the definition
of the braket involves taking the complex conjugate of the first function, if this is a
real number then its conjugate is itself.

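The braket is not completely uniform in its properties. Let’s examine linearity. For
any three functions $f$, $g$, $h$, and any two complex constants $a$, $b$, the ket is
complex-linear:

$$\langle f | ag + bh \rangle = a \langle f | g \rangle + b \langle f | h \rangle \tag{4.18}$$

so this is satisfied for $a, b \in \mathcal{C}$. But the bra is only real-linear:

$$\langle ag + bh | f \rangle = \overline{a} \langle g | f \rangle + \overline{b} \langle h | f \rangle \tag{4.19}$$

so linearity only holds when $a = \overline{a}$ and $b = \overline{b}$; that is,
$a, b \in \mathcal{R}$.

However, the braket is symmetrical under conjugation:

$$\langle f | g \rangle = \overline{\langle g | f \rangle} \tag{4.20}$$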
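The linearity and symmetry properties in Equations 4.18 through 4.20 are easy to confirm for the discrete braket of Equation 4.15. The sketch below is an illustration only; the helper names `braket` and `combo` and the sample values are assumptions introduced here, not notation from the text.

```python
def braket(f, g):
    """Discrete braket <f|g>: sum over n of conj(f[n]) * g[n] (Equation 4.15)."""
    return sum(complex(fn).conjugate() * gn for fn, gn in zip(f, g))


def combo(a, u, b, v):
    """Return the sequence a*u + b*v, element by element (hypothetical helper)."""
    return [a * un + b * vn for un, vn in zip(u, v)]


# Arbitrary complex-valued sample sequences and constants.
f = [1 + 2j, -1j, 0.5]
g = [2 - 1j, 3, 1j]
h = [1j, 1, -2]
a, b = 2 - 3j, 0.5 + 1j

# Equation 4.18: the ket is complex-linear.
ket_lhs = braket(f, combo(a, g, b, h))
ket_rhs = a * braket(f, g) + b * braket(f, h)

# Equation 4.19: constants pulled out of the bra are conjugated.
bra_lhs = braket(combo(a, g, b, h), f)
bra_rhs = a.conjugate() * braket(g, f) + b.conjugate() * braket(h, f)

# Equation 4.20: swapping the arguments conjugates the result.
symmetry_gap = braket(f, g) - braket(g, f).conjugate()

# Equation 4.17: <f|f> is real and equals the sum of squared magnitudes.
norm = braket(f, f)

print(abs(ket_lhs - ket_rhs), abs(bra_lhs - bra_rhs), abs(symmetry_gap), norm)
```

Replacing the sums with numerical integration would give the continuous-time analogue, under whatever quadrature rule one chooses.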

The braket notation is unusual in signal processing, and you won’t find it used 
in too many books on the subject (an exception is Reid and Passin’s book [357]). I 
use it here because the standard notation of Fourier transforms (and the derivations 
leading to them) involves integrals of complex exponentials—such equations involve 
limits and exponents that clutter up the formulas and make them look more complex 
and daunting than they are. It requires some effort and determination to plunge into 
a complex expression and decipher each symbol and its relation to the whole; the 
fewer symbols, the less effort is required. Furthermore, simpler equations are easier 
to understand. If the concepts are understood, and the notation is matched to the 
concepts rather than the mechanics, then we can express relationships among objects 
in a natural way. For signal processing, the braket notation is well matched to the 
concepts we will use, and allows us to write simpler and more intuitive formulas. 

The braket notation is not always appropriate for performing mechanical
transformations on functions and equations, so we will often drop back into explicit
functional form for such operations. We will also write some important formulas
in both representations so that they will appear familiar when encountered in other
texts.

Sometimes we will want to restrict the limits of the integration of a braket to
something less than infinity. We can accomplish this by subscripting the braket with
the desired interval. Thus, for CT functions $f$ and $g$, if $\Gamma = [a, b]$,

$$\int_a^b \overline{f(t)}\, g(t)\, dt = \langle f | g \rangle_{[a,b]} = \langle f | g \rangle_\Gamma \tag{4.21}$$

When no domain of integration is explicitly listed, often we imply the domain of the
first function; as mentioned earlier, this is usually the interval $(-\infty, \infty)$.

A more general definition of the braket involves a weighting function $w(t)$ that
gives different importance to different regions of the domain being integrated. The
braket with weighting function is then

$$\langle f | g \rangle = \int \overline{f(t)}\, g(t)\, w(t)\, dt \tag{4.22}$$

which may be written $\langle f | g \rangle_w$. In this book we generally set
$w(t) = 1$, so most of the time we can safely leave out an explicit weighting function.



4.3.10 Spaces

The Fourier transform may be considered a technique for converting the definition 
of a signal back and forth between two forms. We often speak of these forms as the 
signal-space and frequency-space representations. 

The terminology of referring to representations of numbers, functions, and other 
objects as members of a space is quite intuitive once you get used to it, but it can be 
confusing at first, particularly if you begin by imagining some actual physical space. 
In signal processing, the word “space” is used in an abstract way to refer to a style 
of description. 

For example, consider a publisher who prints sheet music for popular songs.
Most sheet music includes the words, melody, and chords of the song. Chords are
clusters of notes that carry the harmonic structure of the song. There are several
different ways to represent chords, and a music publisher must pick one (or more)
to use in the sheet music. Consider the “B-flat-seven” chord, written B♭7. One
option is simply to write out the four notes of the chord: B♭7 = B♭, D, F, A♭, as in
Figure 4.12(a). This is rather bulky and rarely used; music notation allows us to
represent the same four notes more compactly as four black dots on a staff, as in
Figure 4.12(b). A third common option is to draw a small picture of a guitar neck,
and place black dots where the fingers ought to go, as in Figure 4.12(c). A fourth
option is to simply name the chord in standard notational style, as in Figure 4.12(d).

Chord spaces for B♭7. (a) Listing the chord notes in text. (b) Listing the chord notes
on the staff lines. (c) A picture of a chord for guitar. (d) Simply name the chord.

A guitar student who knows no music theory can follow the picture in (c) and 
get the right results; a more advanced musician may read the chord notes in (b) and 
play them as written, or interpret the chord name in (d) as a suggestion from the 
composer, which may be altered to fit the mood of an improvisational performance. 

There are other choices, but we can stop here. The point is that each of these 
representations of a chord carries the same information as the others, but in a 
different way. We can speak of (a) as the text-space representation of the chord, and 
similarly, (b) is the staff-space representation, (c) is the picture-space representation, 
and (d) is the name-space representation. We may think of the chord itself as an 
abstract object, made up of a collection of notes, which is projected into one of the 
spaces so that we may actually play it. In this case we can easily write the rules for 
transforming from the representation in any space to any other. 

The power of using alternate spaces to represent some object is that sometimes 
it is easier to understand some characteristic of that object in a space other than the 
one in which it was handed to you. To a learning musician, two consecutive chord
pictures might mean nothing more than mechanical instructions for how to change 
one’s fingers on the guitar neck. Another musician given the same pictures may first 
figure out their names (i.e., project the chord from picture space into name space) 
and thereby understand their relationship. 

The purpose of the next chapter is to define and discuss the frequency-space 
representation of a signal and its implications. The Fourier transform is the recipe 
for converting a signal back and forth between signal space and frequency space.
Understanding the characteristics of this transform gives us insight into the nature 
of frequency space itself and how it mirrors objects and actions in signal space. 

A synonym for space is domain. For example, we will sometimes speak about the
representation of a signal in the frequency domain, rather than in frequency space.
The word domain is overloaded in signal processing, since it is also used to represent
the allowable inputs to some function, whose output is the range. Normally the
correct interpretation is clear from context.


4.4 Some Useful Signals

Several signals will prove useful to us for examples and calculations. We summarize 
them below. 


4.4.1 The Impulse Signal

One particularly useful “signal” is unusual in that it isn’t technically a signal at all.
Conceptually, an impulse signal is zero everywhere but 0, where it has an infinite
value. It is an infinitely narrow spike of infinite height, but which integrates to a
value of 1.0. Strictly, we should call this a distribution. We define the impulse
signal, written $\delta(t)$, also called the Dirac delta function [151], as the distribution
that modulates a continuous function $f$ such that

$$\begin{aligned}
\delta(t) &= 0 \quad \text{if } t \neq 0 \\
\int_{-\infty}^{\infty} \delta(t)\, dt &= 1 \\
\int_a^b \delta(t - c)\, f(t)\, dt &= f(c), \quad c \in [a, b]
\end{aligned} \tag{4.23}$$


To get another view of the impulse function, consider the unit step $u(t)$:

$$u(t) = \begin{cases} 0 & t < 0 \\ 1 & t > 0 \end{cases} \tag{4.24}$$

which is shown in Figure 4.13. Note that this function is discontinuous at $t = 0$.
The unit step may be expressed as the integral of the impulse function [327]:

$$u(t) = \int_{-\infty}^{t} \delta(\tau)\, d\tau \tag{4.25}$$

In other words, the impulse is the derivative of the unit step:

$$\delta(t) = \frac{du(t)}{dt} \tag{4.26}$$

Now because u(t) is discontinuous at t = 0, Equation 4.26 doesn’t satisfy the 
conditions for differentiation. But we can think of u(t) as the limit of functions that 
are continuous, and we can see what happens as those functions approach u(t). 

To that end, consider the ramp function $r_w(t)$ defined by

$$r_w(t) = \begin{cases} 0 & t < 0 \\ t/w & 0 \le t \le w \\ 1 & t > w \end{cases} \tag{4.27}$$

as shown in Figure 4.14. This function is 0 to the left of 0, 1 to the right of $w$, and
rises linearly from 0 to 1 in the region from 0 to $w$. The ramp is a continuous function.

Now we can find the derivative of this function without getting into any formal
difficulty. We call the result $\delta_w$, and it is plotted in Figure 4.15.

The ramp function $r_w(t)$.

FIGURE 4.15

The derivative $\delta_w$ of the ramp $r_w$.

Now $\delta_w$ is a box of width $w$ and height $1/w$, so its area is 1 for every value
of $w$. As $w \to 0$, $\delta_w$ gets narrower and narrower, but it must become taller and
taller to maintain unit area. In the limit,

$$\delta(t) = \lim_{w \to 0} \delta_w(t) \tag{4.28}$$

giving us an infinitely thin, but infinitely tall box. So although the value of
$\delta(t)$ at $t = 0$ is infinite, it has a finite area of exactly 1.
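The limiting process behind Equation 4.28 can be watched numerically: the box keeps unit area while its integral against a smooth function approaches the function's value at 0. This is a rough sketch under stated assumptions; the midpoint-rule integrator, the test function `f`, and all names are illustrative choices, not part of the text.

```python
import math


def delta_w(t: float, w: float) -> float:
    """Derivative of the ramp r_w: a box of width w and height 1/w on [0, w]."""
    return 1.0 / w if 0.0 <= t <= w else 0.0


def integrate(func, lo: float, hi: float, steps: int = 10000) -> float:
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    dt = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * dt) for i in range(steps)) * dt


def f(t: float) -> float:
    """Arbitrary smooth test function with f(0) = 1."""
    return math.cos(t) + t * t


def box_area(w: float) -> float:
    """Area under delta_w; the box is zero outside [0, w]."""
    return integrate(lambda t: delta_w(t, w), 0.0, w)


def sift_with_box(w: float) -> float:
    """Integral of f(t) * delta_w(t), which should approach f(0) as w shrinks."""
    return integrate(lambda t: f(t) * delta_w(t, w), 0.0, w)


for w in (1.0, 0.1, 0.01, 0.001):
    # The area stays at 1 while the sifted value drifts toward f(0) = 1.
    print(w, box_area(w), sift_with_box(w))
```

The box never becomes the impulse for any finite $w$; the sketch only shows the trend that the limit formalizes.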

In particular, we will find it useful to note that

$$\int f(t)\, \delta(t)\, dt = f(0) \tag{4.29}$$

The Dirac delta function is plotted in Figure 4.16(a). We will sometimes call this the
impulse signal, impulse function, or just impulse. It is typically drawn with a thin
arrow at $t = 0$, to indicate infinite height, as in the figure. Here the height of the
arrow is irrelevant.

(a) The continuous impulse function $\delta(t)$. (b) The discrete impulse function $\delta[n]$.

The delta function behaves in an unusual way when the argument is scaled.
Specifically,

$$\delta(at) = \frac{1}{|a|}\, \delta(t) \tag{4.30}$$

To see this, suppose we have an arbitrary function $f(t)$. We set $at = \tau$, so
$t = \tau/a$ and $dt = (1/a)\, d\tau$, and write

$$\begin{aligned}
\int \delta(at)\, f(t)\, dt &= \frac{1}{a} \int \delta(\tau)\, f\!\left(\frac{\tau}{a}\right) d\tau \\
&= \frac{1}{|a|}\, f(0) \\
&= \int \left[ \frac{1}{|a|}\, \delta(t) \right] f(t)\, dt
\end{aligned} \tag{4.31}$$

The introduction of the modulus sign in the second line is motivated by observing that
because the delta function is defined to integrate to 1, scaling the function should not
change the sign of the integration. The last line proves Equation 4.30. This behavior
of the delta function must be kept in mind when one scales its argument during a
calculation. Because of the absolute-value sign, $\delta(x) = \delta(-x)$.
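The scaling rule of Equation 4.30 can be illustrated with a narrow box standing in for the impulse. In this sketch the box width, the test function, and the helper names are assumptions made for illustration; a true impulse is only reached in the limit of zero width.

```python
import math


def delta_w(t: float, w: float) -> float:
    """Narrow box of width w and height 1/w on [0, w], approximating the impulse."""
    return 1.0 / w if 0.0 <= t <= w else 0.0


def integrate(func, lo: float, hi: float, steps: int = 10000) -> float:
    """Midpoint-rule approximation of the integral of func over [lo, hi]."""
    dt = (hi - lo) / steps
    return sum(func(lo + (i + 0.5) * dt) for i in range(steps)) * dt


def f(t: float) -> float:
    """Arbitrary smooth test function with f(0) = 3."""
    return math.cos(t) + 2.0


def scaled_sift(a: float, w: float = 1e-3) -> float:
    """Approximate the integral of delta(a*t) * f(t) dt for nonzero a."""
    # delta_w(a*t) is nonzero only where 0 <= a*t <= w, i.e. between 0 and w/a.
    lo, hi = sorted((0.0, w / a))
    return integrate(lambda t: delta_w(a * t, w) * f(t), lo, hi)


for a in (2.0, -4.0, 0.5):
    # Each result should be close to f(0)/|a|, regardless of the sign of a.
    print(a, scaled_sift(a), f(0.0) / abs(a))
```

Note that a negative $a$ flips the box to the other side of the origin but leaves the integral positive, which is the role of the modulus sign in the derivation.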
FIGURE 4.17

(a) The sifting property for a point $t_0$. (b) The sifting property for an interval $[a, b]$.


The discrete-time version of the impulse, $\delta[n]$, may be written

$$\delta[n] = \begin{cases} 1 & n = 0 \\ 0 & \text{otherwise} \end{cases} \tag{4.32}$$

and is plotted in Figure 4.16(b). Although the continuous-time impulse $\delta(t)$ has the
rather unusual properties listed in Equation 4.23, the discrete-time impulse $\delta[n]$ is a
much more conventional function, which is simply 0 everywhere except at $n = 0$,
where it has the value 1.

Note that to place an impulse at any value $k$, we may simply shift the signal so
that $k$ is the new origin; an impulse at $k$ is given by $\delta(t - k)$.

The functional definition of the impulse in Equation 4.29 gives rise to the sifting
property of the impulse signal. For any function $x(t)$ and any value $t_0$, we can make
a signal which is zero everywhere but $t_0$, where it has the value $x(t_0)$. We make this
signal by multiplying $x(t)$ with an impulse at $t_0$.

$$x(t_0) = \int \delta(t - t_0)\, x(t)\, dt \tag{4.33}$$


This is diagrammed in Figure 4.17(a). This may seem a roundabout way of writing
$x(t_0)$, but by sliding the impulse through some domain, we can pick out just the part
of the signal we’re interested in, without explicitly writing each sample individually.

For example, we can sweep $t$ through the entire interval $(-\infty, +\infty)$ and end up
with $x(t)$ itself:

$$x(t) = \int_{-\infty}^{\infty} \delta(t - \tau)\, x(\tau)\, d\tau \tag{4.34}$$

(a) The continuous box $b_W(t)$. (b) The discrete box $b_W[n]$.


We can write a similar relationship for discrete signals:

$$x[n] = \sum_k \delta[n - k]\, x[k] \tag{4.35}$$

Alternatively, we may restrict $\tau$ or $k$ to some other interval (or even several
disjoint intervals). So the piece of $x(t)$ lying between $a$ and $b$ is

$$\int_a^b \delta(t - \tau)\, x(\tau)\, d\tau = \begin{cases} 0 & t < a \\ x(t) & a \le t \le b \\ 0 & b < t \end{cases} \tag{4.36}$$

This is diagrammed in Figure 4.17(b). We use the sifting property often when
calculating Fourier transforms and their inverses.
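For discrete signals the sifting property of Equation 4.35, and its restriction to an interval in the spirit of Equation 4.36, can be written out directly. The following sketch uses hypothetical helper names and an arbitrary sample signal stored as a Python list indexed from 0.

```python
def impulse(n: int) -> int:
    """Discrete impulse: 1 at n = 0 and 0 elsewhere (Equation 4.32)."""
    return 1 if n == 0 else 0


def sift(x, n: int, k_values) -> int:
    """Sum of impulse(n - k) * x[k] over the given k values (Equation 4.35)."""
    return sum(impulse(n - k) * x[k] for k in k_values if 0 <= k < len(x))


x = [3, 1, 4, 1, 5, 9, 2, 6]          # arbitrary sample signal

# Sweeping k over every index reproduces the signal itself.
full = [sift(x, n, range(len(x))) for n in range(len(x))]

# Restricting k to the indices 2, 3, 4 keeps only that piece of the signal.
window = [sift(x, n, range(2, 5)) for n in range(len(x))]

print(full)
print(window)
```

The restricted sum leaves zeros outside the chosen interval, which is the discrete counterpart of the three-case result in Equation 4.36.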


4.4.2 The Box Signal

The continuous-time box function, written $b_W(t)$, has the value 1 within some
interval of width $W$ centered at $t = 0$, and is 0 outside:

$$b_W(t) = \begin{cases} 1 & |t| < W/2 \\ 0 & \text{otherwise} \end{cases} \tag{4.37}$$

It is plotted in Figure 4.18(a).
FIGURE 4.19

(a) The infinite continuous impulse train $\mathrm{III}_T(t)$. (b) The infinite discrete impulse train $\mathrm{III}_T[n]$.


Its discrete-time counterpart $b_W[n]$ is similarly defined:

$$b_W[n] = \begin{cases} 1 & |n| < W/2 \\ 0 & \text{otherwise} \end{cases} \tag{4.38}$$

This is plotted in Figure 4.18(b).


4.4.3 The Impulse Train

A very useful periodic function involving the impulse signal is the impulse train,
sometimes also called a comb or shah function. This is simply an infinite repetition
of impulses at equal intervals. Writing the interval as $T$, the continuous impulse
train $\mathrm{III}_T(t)$ may be written

$$\mathrm{III}_T(t) = \sum_k \delta(t - kT) \tag{4.39}$$

The notation III is meant to remind us of a row of vertical spikes, representing the
impulses. The discrete-time case $\mathrm{III}_T[n]$ is similar:

$$\mathrm{III}_T[n] = \sum_k \delta[n - kT] \tag{4.40}$$

Equations 4.39 and 4.40 are plotted in Figure 4.19(a) and (b).
FIGURE 4.20

The sinc function $\sin(\pi x)/(\pi x)$.


4.4.4 The Sinc Signal

A notational convenience is provided by the sinc (pronounced “sink”) signal,
$\mathrm{sinc}(x)$:

$$\mathrm{sinc}(x) = \frac{\sin(\pi x)}{\pi x} \tag{4.41}$$

plotted in Figure 4.20.

A couple of particular values for the sinc will prove useful later. We define
$\mathrm{sinc}(0) = 1$. We also note that $\mathrm{sinc}(k) = 0$ for all integers
$k \neq 0$. Finally, we observe

$$\int_{-\infty}^{\infty} \mathrm{sinc}(x)\, dx = 1 \tag{4.42}$$
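The special values of the sinc and the unit area of Equation 4.42 can be checked with a few lines of code. This is only a sketch: the truncation limit and step count are arbitrary assumptions, and because the sinc decays slowly the truncated integral approaches 1 only gradually.

```python
import math


def sinc(x: float) -> float:
    """Normalized sinc: sin(pi*x)/(pi*x), with sinc(0) defined as 1 (Equation 4.41)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)


def truncated_integral(limit: int, steps_per_unit: int = 100) -> float:
    """Midpoint-rule integral of sinc over [-limit, limit]."""
    steps = 2 * limit * steps_per_unit
    dx = 2.0 * limit / steps
    return sum(sinc(-limit + (i + 0.5) * dx) for i in range(steps)) * dx


# The zeros at nonzero integers are exact in theory; in floating point they
# come out as values at rounding-error level.
print([sinc(k) for k in (0, 1, 2, 3)])

# The integral over a wide but finite interval is close to, not exactly, 1.
print(truncated_integral(200))
```

Widening the interval brings the truncated integral closer to 1, at the cost of more function evaluations.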


4.5 Convolution 

One of the most useful operations in signal processing is filtering. A filter may be
considered any system which modifies a signal; we say that a particular system filters
the signal, or that the signal has been processed through a filter, or simply filtered.








Recall that the sifting property of the impulse function allows us to write any
signal as a continuous sum of scaled impulses:

$$f(t) = \int f(\tau)\, \delta(t - \tau)\, d\tau \tag{4.43}$$

Suppose we send this signal through a linear system $\mathcal{L}$:

$$\begin{aligned}
\mathcal{L}\{f(t)\} &= \mathcal{L}\left\{ \int f(\tau)\, \delta(t - \tau)\, d\tau \right\} \\
&= \int f(\tau)\, \mathcal{L}\{\delta(t - \tau)\}\, d\tau
\end{aligned} \tag{4.44}$$

We call the signal $h(t, \tau)$ defined by $h(t, \tau) = \mathcal{L}\{\delta(t - \tau)\}$ the
impulse response of the system $\mathcal{L}$. It describes how the system responds to an
impulse signal $\delta(t - \tau)$, which is simply an impulse at $t = \tau$. Equation 4.44
may be interpreted as telling us that if we know how the system responds to an impulse
for each value of $\tau$, we can find that system’s response to any input by breaking the
input into impulses and summing together the system’s response to each impulse, weighted
by the value of the signal at that time. This is an important idea; it is illustrated in
Figure 4.21(a).

If $\mathcal{L}$ is time-invariant, then the system responds the same way no matter when
the input is applied; thus, $h(t, \tau)$ does not change for each value of $\tau$, but is
rather a single function $h(t - \tau)$, valid for every $\tau$. This is illustrated in
Figure 4.21(b). We may then write

$$\mathcal{L}\{f(t)\} = \int f(\tau)\, h(t - \tau)\, d\tau \tag{4.45}$$

We define the convolution operator $*$ to represent this operation:

$$f(t) * h(t) = \int f(\tau)\, h(t - \tau)\, d\tau \tag{4.46}$$

where

$$h(t) = \mathcal{L}\{\delta(t)\} \tag{4.47}$$

The convolution operator is an infix operator; like the addition operator $+$, it appears
between its arguments.

Equation 4.46 defines the operation of convolution. We say that $g(t)$ is the result
of convolving $f(t)$ and $h(t)$, or that $f(t)$ and $h(t)$ are convolved to produce $g(t)$.
The asterisk $*$ is often used as the infix convolution operator, though some authors
surround the asterisk with a circle, as in $\circledast$.

The essential point here is that we have defined one method that allows us to
find the response of any LTI system to any input signal. To review, because of the
sifting property of impulses, we can write any signal as a (perhaps continuous) sum
of scaled impulses. Since the system is linear, its response to the sum of impulses can
be found by summing its response to each individual impulse. Since the system is
time-invariant, it has the same response to an impulse no matter when the impulse
arrives.

FIGURE 4.21

(a) A system’s response in terms of impulse responses. (b) The impulse response is the
same at all values of $t$.

To use the impulse response of a system to find its response to any input, convolve
the impulse response $h(t)$ with the input, as in Equation 4.46. This means that to find
the value of $f(t) * h(t)$ at some point $t$, a scaled and reversed copy of the impulse
response $h(t)$ is placed at $t$, multiplied with $f$, and the resulting product
integrated, as in Figure 4.22. The impulse response is reversed because of the negative
sign on $\tau$ in the argument of $h(t - \tau)$ in the definition.

FIGURE 4.22

An example of convolution. (a) The input signal $x(t)$. (b) The impulse response $h(t)$.
(c) The response of the system to $x(t_0)$. (d) The product of $x(t)$ and $h(t)$ is
integrated to find the convolution at $t$.

When the input signal is just a sequence of impulses, we can simply place a 
reversed copy of the impulse response h(t) at each impulse and scale it accordingly. 
If the impulses arrive sufficiently slowly, and the impulse response has finite support, 
then the responses will not add to each other. As the impulses arrive more quickly, 
the responses move closer together, and after a certain point they begin to overlap 
and sum together. This phenomenon is shown in Figure 4.23. 

As an example of this behavior, recall the description of clapping your hands in a
concert hall. If you clap once and then wait, the sound will echo through the room 
and then (for all practical purposes) eventually fade out to nothing. The sounds of 
successive claps, followed by long pauses, will not add together because each one 
will fade out before the next arrives. But if you clap repeatedly and quickly, so that 
each new clap is made while the sound of previous claps is still reverberating, the 
sounds will add to each other and will be harder to distinguish. The response of 
the system to each input event hasn’t changed, but each response is harder to isolate 
from the others. 

Consider the effect of convolving a finite-support signal $f(t)$ with width $W_f$ with
a shah function $s(t)$ with period $T$:

$$y(t) = f(t) * s(t) = \int f(\tau) \sum_k \delta(t - \tau - kT)\, d\tau \tag{4.48}$$

(a) An impulse response $h(t)$ with finite support. (b) An input signal $x_1(t)$ of
impulses. (c) The output $x_1(t) * h(t)$; note that the responses are independent.
(d) An input signal $x_2(t)$ of closer impulses. (e) The output $x_2(t) * h(t)$; note
that the responses overlap.


(a) The convolution result when the period of $s(t)$ is larger than the width of $f(t)$.
(b) The convolution result when the period of $s(t)$ is equal to the width of $f(t)$.
(c) The convolution result when the period of $s(t)$ is smaller than the width of $f(t)$.


This is illustrated in Figure 4.24(a). The result is that a copy of f(t) is placed at 
each of the impulses in s(t). When the period T matches the width $W_f$, the copies
touch each other, as in Figure 4.24(b). When the period is smaller than the width, 
the copies overlap and sum together, as in Figure 4.24(c). 
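The three cases of Figure 4.24 can be reproduced with sampled signals. The sketch below is a discrete stand-in for Equation 4.48, under the assumptions that the signal is a short list of samples and the impulse train is finite; the function names and sample values are illustrative.

```python
def convolve(x, h):
    """Direct discrete convolution of two finite sequences."""
    y = [0.0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += xn * hm
    return y


def impulse_train(length: int, period: int):
    """Finite stretch of a discrete impulse train with the given period."""
    return [1.0 if n % period == 0 else 0.0 for n in range(length)]


f = [1.0, 2.0, 3.0]                             # signal of width 3 samples

wide = convolve(impulse_train(9, 4), f)         # period larger than the width
touching = convolve(impulse_train(9, 3), f)     # period equal to the width
overlapping = convolve(impulse_train(9, 2), f)  # period smaller than the width

print(wide)         # separate copies of f with gaps between them
print(touching)     # copies of f placed end to end
print(overlapping)  # copies of f that overlap and sum
```

In the overlapping case the last sample of one copy lands on the first sample of the next, so those positions hold the sum of the two.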
FIGURE 4.25

(a) A piece of magnetic tape and a tape head. (b) The sensitivity of the head $s(p)$
varies with $p$ over the head. (c) The physical convolution of the tape head and a piece
of tape.


Convolution is important because it tells us how to use a system’s impulse response 
to find the output of the system to a given input. This is useful when we need to write 
computer programs to implement linear systems, such as camera lenses or reflecting 
surfaces. We may directly implement convolution as the algorithm to carry out a 
simulation of any linear system operating on any signal, given only the signal and 
that system’s impulse response. 


4.5.1 A Physical Example of Convolution

Let’s consider a physical example of convolution as described in Bracewell [61]. 
Figure 4.25(a) shows part of an ordinary tape recorder. We assume a signal f(t ) has 
been recorded on a magnetic tape, where the mapping from time to position is linear 
and monotonic: the signal f(t) is placed on the tape at position x = t, where x = 0 
is the physical start of the tape. Upon playback, the tape is pulled at a uniform speed 
over the playback head, which reads the magnetic field off the tape and converts that 
information into an electrical signal that is then amplified. The tape head has a finite 
width W, and its sensitivity s(p) varies over its width as a function of position p on 
the head; we place p = 0 at the center of the head, as shown in Figure 4.25(b). 
At any given moment $t$, the head is placed over some section of tape centered at
$x$, so the head is in contact with a section of the tape from $[x - W/2, x + W/2]$, as
shown in Figure 4.25(c). We will assume that each part of the head responds only
to the field on the piece of tape immediately in front of it. So the response $y(t)$ of
the head may be written as an integral of the signal on the piece of tape times the
response of the head at each point:



(4.49) 


The convolution integral in Equation 4.49 says that the response at every moment 
is the product of the signal on the tape over some width times the responsivity of 
the head at each point. As the tape streams over the head, different sections are 
integrated, though the response s(p) remains constant. 

In computer graphics, we convolve every time we display an image. As shown in 
Figure C at the start of this unit, the display device takes the signal that we compute 
and combines it with the response of the hardware to create a displayed image. On 
a CRT, each dot we compute is convolved with the Gaussian blob representing the 
footprint of the beam on the face of the tube; generally this blob is large enough
that the footprints overlap and the displayed response of each dot is at least partly 
influenced by nearby dots. 

To evaluate a convolution manually requires summing together as many scaled 
impulse responses as there are values in the input signal. When the input signal 
is discrete, or is made up of impulses itself, we can imagine manually placing and 
summing the finite number of impulse responses to find the output of the system. 
But when the input signal is continuous, we cannot compute the convolution directly 
by such a brute-force strategy. We will see later that Fourier transforms offer an 
alternative. 

Convolution of discrete signals can be an expensive operation; if a discrete input
has $N$ samples and the convolution filter has $M$ samples, then we require about
$MN$ multiplies and additions to implement the equations above. For 2D signals, $N$
might represent the pixels of a $512 \times 512$ image (so $N = 2^{18}$), and $M$ might
span a $5 \times 5$ grid of pixels, requiring $25 \times 2^{18}$ floating-point operations;
this may be too expensive for some applications (and is probably an overly conservative
estimate for today’s rendering densities). We will see later that the Fourier transform
also provides an alternative way to compute convolutions that may be less expensive in
some situations.
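The operation count quoted above follows directly from the structure of a brute-force implementation. Here is a minimal sketch of direct one-dimensional convolution that also tallies its multiplies; the function name, the counter, and the sample data are assumptions introduced for illustration.

```python
def convolve_counting(x, h):
    """Direct convolution of x (N samples) with h (M samples).

    Returns the output sequence and the number of multiplies performed,
    which is exactly N * M for this brute-force loop.
    """
    y = [0.0] * (len(x) + len(h) - 1)
    multiplies = 0
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += xn * hm
            multiplies += 1
    return y, multiplies


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]   # N = 6 input samples
h = [0.25, 0.5, 0.25]                # M = 3 filter samples
y, multiplies = convolve_counting(x, h)
print(y, multiplies)

# The 2D example in the text: a 5 x 5 filter over a 512 x 512 image.
print(25 * 512 * 512)
```

A two-dimensional version would nest the same pattern over rows and columns, giving the $25 \times 2^{18}$ count for the image example.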


4.5.2 The Response of Composite Systems

Convolution has several useful properties. For convenience, we will leave off the
function argument in the following list, since the properties are true for both CT and
DT signals. For any three functions $x$, $h_1$, and $h_2$, convolution is

$$\begin{aligned}
\text{commutative:} \quad & x * h_1 = h_1 * x \\
\text{associative:} \quad & x * (h_1 * h_2) = (x * h_1) * h_2 \\
\text{distributive:} \quad & x * (h_1 + h_2) = (x * h_1) + (x * h_2)
\end{aligned} \tag{4.50}$$

FIGURE 4.26

(a) A serial system $h_1$ followed by $h_2$. (b) An equivalent system.

These properties allow us to characterize the response of a wide variety of systems. 
The latter two properties, in particular, simplify series and parallel arrangements, 
which are the fundamental building blocks of many more complex systems. 

Consider a pair of systems $h_1$ and $h_2$, connected in a series network as in
Figure 4.26(a). The output of the first system is $f * h_1$. The output of the second
system is $(f * h_1) * h_2$. By the associative property, this is equivalent to a single,
combined system with impulse response $(h_1 * h_2)$:

$$(f * h_1) * h_2 = f * (h_1 * h_2) \tag{4.51}$$

as shown in Figure 4.26(b). So we can precompute $h_s = h_1 * h_2$ and replace two
convolutions with one.

A parallel network is shown in Figure 4.27(a). Here two independent systems
receive the input $f$, and their results are summed together. By the distributive
property, the sum of the outputs may be represented by a single system with impulse
response $(h_1 + h_2)$:

$$(f * h_1) + (f * h_2) = f * (h_1 + h_2) \tag{4.52}$$

as shown in Figure 4.27(b). Again, precomputing $h_p = h_1 + h_2$ lets us save a step
and clarify our understanding of the system.
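The series and parallel simplifications of Equations 4.51 and 4.52 can be confirmed on small discrete examples. The sketch below uses integer-valued sequences so the comparisons are exact; the helper names and sample values are illustrative assumptions.

```python
def convolve(x, h):
    """Direct discrete convolution of two finite sequences."""
    y = [0] * (len(x) + len(h) - 1)
    for n, xn in enumerate(x):
        for m, hm in enumerate(h):
            y[n + m] += xn * hm
    return y


def add(u, v):
    """Element-wise sum, padding the shorter sequence with trailing zeros."""
    size = max(len(u), len(v))
    u = list(u) + [0] * (size - len(u))
    v = list(v) + [0] * (size - len(v))
    return [un + vn for un, vn in zip(u, v)]


f = [1, 2, 0, -1]      # arbitrary input signal
h1 = [1, 1]            # impulse response of the first system
h2 = [1, -2, 1]        # impulse response of the second system

# Series network: two convolutions versus one with the precomputed h_s.
series_two_steps = convolve(convolve(f, h1), h2)
series_one_step = convolve(f, convolve(h1, h2))

# Parallel network: two convolutions and a sum versus one with h_p.
parallel_two_steps = add(convolve(f, h1), convolve(f, h2))
parallel_one_step = convolve(f, add(h1, h2))

print(series_two_steps == series_one_step)
print(parallel_two_steps == parallel_one_step)
print(convolve(f, h1) == convolve(h1, f))   # commutativity
```

Precomputing the combined response pays off when the same pair of systems is applied to many inputs.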

Keep in mind that the term system may be interpreted as a program or procedure
for many computer graphics applications. Thus two systems in a series may be
modeled by two procedures A and B, where the input of B is the output of A. In
computer graphics, we cascade systems in this way all the time. For example, to
render a polygonal scene using z-buffering, we take a polygon as input and transform
it into a rectangular grid of color values and depth using the scan-conversion program
A. Then this grid is combined with the existing z-buffer and color grid by the visibility
resolving program B. The complete image may be smoothed to reduce jaggies by
a postprocessor C. The point is that we can often decompose a complex task into
simpler ones (a popular principle of software engineering); when these are signal
processing tasks, there is a direct correlation to decomposing a system into a set of
simpler systems.

FIGURE 4.27

(a) A parallel system $h_1$ and $h_2$. (b) An equivalent system.


4.5.3 Eigenfunctions and Frequency Response of LTI Systems 

We mentioned earlier that the complex exponentials are eigenfunctions of LTI
systems; that is, they pass through such systems unchanged except for scaling by a
(perhaps complex) constant. We will prove that assertion now.

Suppose we have an input signal x(t) = e^{j\omega t}. Let’s find the response of any 
LTI system to this signal. 

We write the response y(t) as the convolution of the input signal x(t) with the 
system’s impulse response h(t); we will make no assumptions about h(t). 

    y(t) = \int h(\tau) x(t - \tau) d\tau 
         = \int h(\tau) e^{j\omega(t - \tau)} d\tau 
         = e^{j\omega t} \int h(\tau) e^{-j\omega\tau} d\tau 
         = e^{j\omega t} H(\omega)    (4.53)

The third step is justified because the system is linear and e^{j\omega t} is a constant with 
respect to \tau. 

The complex value H(\omega) is called the frequency response (or system transfer 
function) of the system. It is a complex constant defined by 

    H(\omega) = \int h(\tau) e^{-j\omega\tau} d\tau    (4.54)

So an input of frequency \omega passes through the system unchanged except for a scaling 
factor H(\omega). Recall we made no demands upon the system except that it was LTI. 
Thus for any LTI system, e^{j\omega t} is an eigenfunction with associated eigenvalue H(\omega) 
given by Equation 4.54. 

This fact reveals that one of the easiest types of functions to study with respect 
to LTI systems are the complex exponentials, since they pass through such systems 
unchanged except for complex scaling. If we can represent an input signal as a sum 
of these functions, then we can find the response of the system to each exponential 
individually, and then sum the responses together. The Fourier series and transform 
provide precisely the tools that decompose a signal into a sum of exponentials. 
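
The eigenfunction property is easy to observe numerically. The sketch below is an illustration under assumptions, not the derivation above: it uses a discrete-time analog (a sampled exponential and a short, hypothetical impulse response `h`), so the integral in the frequency response becomes a sum.

```python
import cmath


def frequency_response(h, omega):
    """Discrete analog of the frequency response: sum_k h[k] e^{-j omega k}."""
    return sum(hk * cmath.exp(-1j * omega * k) for k, hk in enumerate(h))


def response_at(h, x, n):
    """Output sample y[n] = sum_k h[k] x(n - k); x is a function of an integer index."""
    return sum(hk * x(n - k) for k, hk in enumerate(h))


h = [0.25, 0.5, 0.25]   # hypothetical impulse response (a small smoothing filter)
omega = 0.7             # arbitrary test frequency, radians per sample


def x(n):
    """Complex exponential input e^{j omega n}."""
    return cmath.exp(1j * omega * n)


H = frequency_response(h, omega)

# Every output sample divided by the matching input sample gives the same
# complex constant, which is the eigenvalue H(omega).
ratios = [response_at(h, x, n) / x(n) for n in range(-3, 4)]
```

The input emerges with its frequency intact; only its amplitude and phase are changed, by the single complex factor `H`.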


4.5.4 Discrete-Time Convolution 

The discrete-time version of convolution follows from the continuous version with 
almost no changes, except that the integrals are replaced with summations. 

We write the response of a discrete-time system to an impulse at time k as h_k[n]. 
We may then find the output y[n] of a system \mathcal{L} to an input x[n] from 

    y[n] = \sum_k x[k] h_k[n]    (4.55)

If the system is also shift invariant, then it doesn’t matter when the impulse arrives; 
an isolated impulse at time n will produce the same response as one at time n - k for 
any k, apart from the corresponding shift in time. If the impulse response lasts for 
more than one sample, the copies will sum 
together, but each response is effectively independent of the others. In this case we 
can drop the sample-identification subscript k and write h_k[n] = h[n - k]. 

We can now rewrite Equation 4.55 as 

    y[n] = \sum_k x[k] h[n - k]    (4.56)

Equation 4.56 is the definition of the discrete-time convolution sum, usually 
written with the infix operator *. 
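
As a rough illustration, the convolution sum can be written out directly for finite sequences. This sketch assumes both sequences start at n = 0 and are zero outside their stored samples; the function name `convolution_sum` and the test signals are illustrative.

```python
def convolution_sum(x, h):
    """y[n] = sum_k x[k] h[n - k] for finite sequences that start at n = 0."""
    y = []
    for n in range(len(x) + len(h) - 1):
        total = 0.0
        for k in range(len(x)):
            if 0 <= n - k < len(h):   # h is taken to be zero outside its support
                total += x[k] * h[n - k]
        y.append(total)
    return y


h = [1.0, 0.5, 0.25]               # hypothetical impulse response

impulse = [1.0, 0.0, 0.0, 0.0]     # impulse at n = 0
delayed = [0.0, 0.0, 1.0, 0.0]     # impulse at n = 2
both = [1.0, 0.0, 1.0, 0.0]        # two impulses

y_impulse = convolution_sum(impulse, h)   # a copy of h
y_delayed = convolution_sum(delayed, h)   # the same copy, shifted by 2
y_both = convolution_sum(both, h)         # the two copies summed
```

The overlapping copies of the impulse response simply add, which is the shift-invariance argument above in executable form.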


4.6 Two-Dimensional Signals and Systems 

The techniques presented earlier in this chapter may be generalized to higher 
dimensions; we will discuss the 2D signal as a useful special case. Two-dimensional signals 
are important in image rendering. For example, the 2D plane of an image, and the 
2D hemisphere which integrates light arriving at a point, are both 2D signals. A 
signal’s two dimensions need not be spatial. For example, some filtering techniques 
work in time and one spatial dimension. Thus, we could write a function f(x , y) or 
f(x, t) to distinguish these two cases. 

The most general notation is probably f(x_1, x_2), which avoids any particular 
interpretation of the arguments and allows for easy generalization to higher 
dimensions. This notation is somewhat awkward, however, so we will use f(x, y) 
below. Keep in mind, though, that these two arguments can have different physical 
interpretations. 

This section will briefly review the generalizations of the preceding discussions 
to two dimensions. We will be mostly concerned with showing the nature of the 
generalization, the appropriate notation, and some examples. Most of the principles 
of 1D signals and systems generalize in similar ways, so we don’t need to review 
every property individually. 


4.6.1 Linear Systems 

We begin by reviewing some of the properties of linear, time-invariant systems in two 
dimensions. The property of time invariance is sometimes called spatial invariance 
when the signal is defined over spatial dimensions. A 2D system \mathcal{L} is linear if for 
two functions f(x, y) and g(x, y), it satisfies 

    \mathcal{L}\{a f(x, y) + b g(x, y)\} = a \mathcal{L}\{f(x, y)\} + b \mathcal{L}\{g(x, y)\}    (4.57)

The 2D impulse signal \delta(x, y) is defined in a way analogous to the 1D signal: 

    \iint \delta(x, y) f(x, y) dx dy = f(0, 0)    (4.58)

and is shown in Figure 4.28(a). Note that this is the product of two 1D delta 
functions: 

    \delta(x, y) = \delta(x) \delta(y)    (4.59)

FIGURE 4.28 
(a) \delta(x, y). (b) \delta[m, n]. 

The discrete 2D delta function \delta[m, n] may be defined as follows: 

    \delta[m, n] = \begin{cases} 1 & m = 0 \text{ and } n = 0 \\ 0 & \text{otherwise} \end{cases}    (4.60)

and is shown in Figure 4.28(b). 


4.6.2 Two-Dimensional Brakets 

In two dimensions, the braket becomes a double integral or double sum. As in 1D, 
the overbar denotes the complex conjugate of the first argument. For 
continuous-time functions f(x, y) and g(x, y), 

    \langle f | g \rangle = \iint \overline{f}(x, y) g(x, y) dx dy    (4.61)

For discrete-time functions f[m, n] and g[m, n], 

    \langle f | g \rangle = \sum_m \sum_n \overline{f}[m, n] g[m, n]    (4.62)

FIGURE 4.29 
Two-dimensional convolution. The shaded area is the area integrated to find the convolution at 
(x, y). 


4.6.3 Convolution 

In general, the 2D convolution g(x, y) of a pair of 2D continuous-time functions 
f(x, y) and h(x, y) is given by 

    g(x, y) = f * h = \iint f(\eta, \xi) h(x - \eta, y - \xi) d\eta d\xi 
            = h * f = \iint h(\eta, \xi) f(x - \eta, y - \xi) d\eta d\xi    (4.63)

For two discrete-time functions f[m, n] and h[m, n], the convolution is similar: 

    g[m, n] = f * h = \sum_{k_1} \sum_{k_2} f[k_1, k_2] h[m - k_1, n - k_2] 
            = h * f = \sum_{k_1} \sum_{k_2} h[k_1, k_2] f[m - k_1, n - k_2]    (4.64)


A graphical illustration of 2D convolution is shown in Figure 4.29. To find the 
convolution of f(x, y) with h(x, y) at any point (x_0, y_0), reflect h around the x and 
y axes, and then translate it so that its origin lines up with (x_0, y_0). Then multiply 
the two signals together at every point and sum the result. As k_1 and k_2 sweep out 
different values of (x_0, y_0), the entire input plane is swept and every point of f is 
convolved with h. Since convolution is commutative, we can swap the roles of f and 
h in this discussion and the results are the same. 


4.6.4 Two-Dimensional Impulse Response 

We can find the impulse response of a 2D LTI system as we did for the 1D case. 
We begin by using the sifting property of the 2D impulse function to select values of 
f(x, y): 

    f(x, y) = \iint f(\eta, \xi) \delta(x - \eta, y - \xi) d\eta d\xi    (4.65)

Then we can find the response of a system \mathcal{L} to this signal by 

    \mathcal{L}\{f(x, y)\} = \mathcal{L}\left\{ \iint f(\eta, \xi) \delta(x - \eta, y - \xi) d\eta d\xi \right\} 
                           = \iint f(\eta, \xi) \mathcal{L}\{\delta(x - \eta, y - \xi)\} d\eta d\xi 
                           = \iint f(\eta, \xi) h(x, y; \eta, \xi) d\eta d\xi    (4.66)

where h(x, y; \eta, \xi) = \mathcal{L}\{\delta(x - \eta, y - \xi)\} is the impulse response of the 2D system. If 
the system is LTI, then as in the 1D case the impulse response is independent of the 
location of the impulse; so h(x, y; \eta, \xi) = h(x - \eta, y - \xi). Thus 

    g(x, y) = \mathcal{L}\{f(x, y)\} 
            = \iint f(\eta, \xi) h(x - \eta, y - \xi) d\eta d\xi 
            = f(x, y) * h(x, y)    (4.67)
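
For sampled images, the output of a 2D LTI system is a double convolution sum. The sketch below is one possible direct implementation, assuming both arrays are small rectangular lists of lists indexed from [0][0] and zero elsewhere; `convolve2d` and the sample arrays are hypothetical names and values, not from the text.

```python
def convolve2d(f, h):
    """Full 2D discrete convolution of two rectangular lists of lists."""
    fr, fc = len(f), len(f[0])
    hr, hc = len(h), len(h[0])
    g = [[0.0] * (fc + hc - 1) for _ in range(fr + hr - 1)]
    for m in range(fr + hr - 1):
        for n in range(fc + hc - 1):
            total = 0.0
            for k1 in range(fr):
                for k2 in range(fc):
                    i, j = m - k1, n - k2   # index into the reflected, shifted h
                    if 0 <= i < hr and 0 <= j < hc:
                        total += f[k1][k2] * h[i][j]
            g[m][n] = total
    return g


f = [[1.0, 2.0],
     [3.0, 4.0]]
h = [[0.0, 1.0],
     [1.0, 0.0]]      # two unit impulses, at [0][1] and [1][0]
delta = [[1.0]]       # discrete 2D impulse

g_fh = convolve2d(f, h)
g_hf = convolve2d(h, f)          # commutativity: same result
g_fd = convolve2d(f, delta)      # convolving with the impulse returns f
```

Because `h` here is a pair of shifted impulses, the output is the sum of two shifted copies of `f`, which mirrors the sifting argument used in the derivation.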


4.6.5 Eigenfunctions and Frequency Response 

The eigenfunctions of 2D systems are the 2D exponentials e^{j\omega(x+y)}. The proof is 
similar to the 1D case. We find the output by writing the convolution of f(x, y) = 
e^{j\omega(x+y)} with an arbitrary system impulse response h(x, y), and then apply linearity. 

    g(x, y) = \iint h(\eta, \xi) f(x - \eta, y - \xi) d\eta d\xi 
            = \iint h(\eta, \xi) e^{j\omega(x - \eta + y - \xi)} d\eta d\xi 
            = \iint h(\eta, \xi) e^{j\omega(x+y)} e^{-j\omega(\eta+\xi)} d\eta d\xi 
            = e^{j\omega(x+y)} \iint h(\eta, \xi) e^{-j\omega(\eta+\xi)} d\eta d\xi 
            = e^{j\omega(x+y)} H(\omega)    (4.68)

So an input signal emerges unchanged except for a complex scaling factor 
H(\omega), defined by 

    H(\omega) = \iint h(\eta, \xi) e^{-j\omega(\eta+\xi)} d\eta d\xi    (4.69)

The function H(\omega) is called the frequency response or system transfer function of 
the system with impulse response h(x, y), as in the 1D case. 


4.7 Further Reading 

This chapter has provided only the basics of signals and systems. Digital signal 
processing has become sufficiently popular in recent years that a number of very 
readable and useful textbooks have appeared. 

Good general texts on the basics of 1D digital signal processing include books 
by Oppenheim and Schafer [326], Gabel and Roberts [151], and Oppenheim and 
Willsky [327]. In particular, the 1983 book by Oppenheim and Willsky [327] is very 
accessible for study outside of a formal classroom setting. A different approach to 
the derivation of convolution is offered by Castleman [77], and Bracewell [61] offers 
additional discussion of many of the ideas only touched on here. Signal-processing 
code in C is available in the book by Reid and Passin [357]. 

Many signal-processing operations can be quite sensitive to issues of numerical 
stability and precision. Acton [3] provides a good introduction to this subject. 

Multidimensional signal processing generalizes our 1D descriptions and presents 
its own challenges. The book by Dudgeon and Mersereau [130] reviews that field. 

In this book we take the view that linear systems are easy to study and nonlinear 
systems are hard. This has been the attitude among most engineers and scientists for 
a long time, but it’s beginning to change. A popular discussion of the emerging field 
of chaos theory has been written by Gleick [161]. An introduction to the qualitative 
behavior of dynamic systems may be found in the series by Abraham and Shaw, 
recently collected into a single volume [2]. This book will be particularly appealing 
to many people in computer graphics because of its rich visuals and predominantly 
geometric explanations of phase space. Another recent introduction to nonlinear 
dynamics is the two-volume set by Jackson [226,227]. This also offers rich geometric 
pictures but requires rather more work to understand than Abraham and Shaw’s 
book. 


4.8 Exercises 

Exercise 4.1 

Characterize the following functions as linear or nonlinear, and as even, odd, or 
neither, and prove your characterization. In these functions, x is real and z is 
complex. 

(a) f(x) = 3x - 2 

(b) f(x) = x^3 + x 

(c) f(x) = e^{-x^2} 

(d) f(z) = 3z - 2 

(e) f(z) = z + z 

(f) f(z) = z - Re(z) 

(g) f(z) = z^2 

Exercise 4.2 

Show that the following systems of functions are orthogonal: 

(a) cos nx, n \in Z and n > 0 on [0, \pi] 

(b) cos nx, n \in Z and n > 1 on [0, \pi] 

(c) sin(2n + 1)x, n \in Z and n > 1 on [0, \pi/2] 

Exercise 4.3 

Equation 4.70 provides the definition of the Walsh functions , an orthogonal family 
of functions. Write a program to plot any desired Walsh function, and plot functions 
1, 2, 3, 4, 7, and 10. Do you think this could be a good set of basis functions for 
representing images? Why? 


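    \phi_0(t) = 1,    0 \le t \le 1 

    \phi_{m+1}^{(2k-1)}(t) = \begin{cases} \phi_m^{(k)}(2t) & 0 \le t < 1/2 \\ (-1)^{k+1} \phi_m^{(k)}(2t - 1) & 1/2 < t \le 1 \end{cases} 

    \phi_{m+1}^{(2k)}(t) = \begin{cases} \phi_m^{(k)}(2t) & 0 \le t < 1/2 \\ (-1)^{k} \phi_m^{(k)}(2t - 1) & 1/2 < t \le 1 \end{cases} 

    m = 1, 2, 3, ...;    k = 1, 2, ..., 2^{m-1}    (4.70)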


Exercise 4.4 

Draw the vectors \phi_k[n] = e^{jk(2\pi/8)n} in the complex plane over one period (n = 
0, 1, ..., 7) for each value of k = 1, 2, ..., 8. What happens when the frequency 
wraps (i.e., n > 8)? 

Exercise 4.5 

Prove that the convolution of two Gaussian bumps is another Gaussian bump. 

Exercise 4.6 

Derive properties E14 and E15. 

Exercise 4.7 

(a) Prove Equation 4.18. 

(b) Prove Equation 4.19. 

(c) Prove Equation 4.20. 

Exercise 4.8 

Prove Equation 4.50. 

Exercise 4.9 

What if the braket didn’t conjugate its first argument? Work out the result of \langle f | f \rangle 
under that condition. Is the result useful? 


Exercise 4.10 

Graphically convolve the signal f(x) with the filter g(x) as shown in Figure 4.30. 


Exercise 4.11 

Find the frequency response H(\omega) for the following impulse responses. 

(a) h(t) = 5 

(b) h(t) = 5t 

(c) h(t) = 2t^2 

FIGURE 4.30 
A signal f(x) and a filter g(x) for Exercise 4.10. 


The fundamental pillars on which all successful 
decipherments have rested are five in number: 
(1) The database must be large enough, with 
many texts of adequate length. (2) The 
language must be known ... (3) There should be 
a bilingual inscription of some sort, one 
member of which is in a known writing system. 
(4) The cultural context of the script should be 
known ... (5) For logographic scripts, there 
should be pictorial references, either pictures to 
accompany the text, or pictorially derived 
logographic signs. 

Michael D. Coe 

(“Breaking the Maya Code,” 1992) 



FOURIER TRANSFORMS 


5.1 Introduction 

An image displayed on a piece of paper or on a screen is made up of many small dots. 
Even photographic film has a built-in grain size that limits the spatial precision of 
the image. As we look ever closer at any stored image, we eventually hit the inherent 
resolution limit of the medium. 

We often want to have as much spatial resolution as possible in our media, in order 
to show the crispest possible pictures. But in computer graphics, more resolution 
means more computation, and that also means more time. Often we don’t have 
the resources to make images that have a resolution comparable to film grain, and 
instead we must satisfy ourselves with relatively coarse pictures, typically displayed 
on a 512 x 512 or 1024 x 1024 square grid. Usually the colors that may be represented 
are also limited. On a monitor, we usually have 2^8 or 2^24 different color choices, 
each specified in RGB space to a precision of one part in 2^8 or 2^10 [181]. 

How can such a coarse display possibly capture spiderwebs, the tiny glint on a 
hummingbird’s feathers, or the streaks left by rain on a window? These phenomena 
may seem far too small, subtle, or both, to represent on such a coarse grid with a 
finite number of colors. We might expect to end up with a blocky image of almost 
uncorrelated colors as we strive to match these fine-scale structures with our rough 
displays. 

The situation is of course not that bad, but the reasons why are not obvious. 
A very large part of the solution is the human visual system, which is why we 
studied it in such detail in the first unit. The HVS will fill in all sorts of detail from 
rough information. We just have to make sure that we’re presenting it with all the 
information we can get our hands on. 

One way to make sure our pictures are as good as they can be is to make sure that 
they contain nothing extraneous or wrong. This sounds fine in principle, but it turns 
out that the mere process of representing an inherently continuous color picture on a 
device with a finite number of spatial locations usually introduces errors of its own. 
These errors are known collectively as aliasing , and they lead to phenomena like 
jagged edges, thin objects that seem to be broken into pieces, and, in animations, 
objects that suddenly appear and disappear. 

The source of this problem is that we are throwing away information when going 
from the high-resolution original to our relatively low-resolution image. We need to 
be very careful what information we lose: if we want to compress a Mozart piano 
concerto, we could simply leave out every occurrence of a flat or sharp note, but that 
would hardly capture the original in a compressed form. To complicate the problem, 
the human visual system will try to fill in for the information we lose. So we need to 
leave information out very carefully in such a way that the image we present to an 
observer is interpreted as a picture as well matched to the original as possible. 

Of all the tools that have been developed for understanding and characterizing the 
quality of this match, in my opinion the most powerful is the Fourier transform. The 
Fourier transform takes a signal and represents it in frequency space. This alternate 
representation allows us to understand what happens when any continuous signal is 
turned into a set of samples. In our case, it will help us follow the transformation of 
a continuous visual image into dots on a screen or page. 

The Fourier transform is a mathematical operator that decomposes a signal into 
a sum of weighted sines and cosines. The inverse transform runs the other way and 
combines those sines and cosines back into a time signal. 

The advantage of the frequency-space representation is that it gives us a new 
vocabulary with which to discuss systems, and new tools for characterizing them. 
Our principal interest is in the processes of sampling and reconstruction. These 
operations are unavoidable in computer graphics, and are the source of aliasing in 
all its forms. They are best discussed in terms of the Fourier transform and the 
frequency-space representation of systems. 

Because the frequency-space representation is central to understanding digital 
signal processing, we will develop that representation in detail in this chapter. We 
will use the ideas from the previous chapter to discuss continuous-time and discrete-time 
signals, and LTI and LSI systems in both signal and frequency space. 

We will derive the Fourier transform in some detail because it is one of the 
most powerful analytic tools we have for analyzing systems. Developing a good 
intuitive understanding of the Fourier transform and its characteristics will serve you 
well when you consider any aliasing phenomena or develop any signal-processing 
programs. The best way to develop that intuition is to understand the reasoning 
behind the transform, rather than simply be able to execute the mechanics. The 
good news is that the principles involved are few in number and elegant in nature. 
I have attempted to phrase the development here so that it is relevant to computer 
graphics; I have omitted much interesting material that is not useful in the practice 
of computer graphics, or in the analysis of graphics systems as we typically consider 
them today. More complete treatments may be found in the references discussed at 
the end of the chapter. 

This chapter will give us the concepts and vocabulary that we will need throughout 
this book for discussing the important problems of sampling and reconstruction, and 
many means of reducing or eliminating their effects. 


5.2 Basis Functions 

This section presents an argument that a sum of complex exponentials can match 
almost any real-valued function. We will get to this result in steps. 

Our first goal is to show that functions may be represented as combinations 
of other, simpler basis functions. Our development will proceed by analogy to 
coordinate systems in Euclidean space. 


5.2.1 Projections of Points in Space 

Consider a typical 3D Euclidean space. To locate points, we create a reference frame , 
as in Figure 5.1(a), consisting of an origin and three (usually perpendicular) vectors, 
typically called X, Y, and Z. 

To represent any vector V in this space, we specify three coordinates. We can 
interpret those coordinates in at least two ways. One is that they are the components 
of a single 3D vector V from the origin. Another interpretation is that they specify 
the lengths of three component vectors , one for each coordinate axis. The lengths 
of these vectors correspond to the length of the projection of V on each axis; the 
lengths of these projections may be computed by the dot product of V on each axis. 

A vector V = (x, y, z), in a space with axes represented by unit-length vectors 
X, Y, and Z, may be written as the sum of the three component vectors V_x, V_y, 
and V_z: 

    V = V_x + V_y + V_z 
      = (V \cdot X) X + (V \cdot Y) Y + (V \cdot Z) Z    (5.1)

FIGURE 5.1 
(a) A 3D reference frame made of three orthogonal vectors and an origin, and a vector V. (b) The 
point V specified by three vectors, one each for each axis of the frame. 


Equation 5.1 is illustrated in Figure 5.1(b). Here we have represented a vector 
as the sum of three scaled basis vectors X, Y, and Z. The advantage of this 
representation is that we can now think of all the operations we might perform on 
any vector in terms of operations on just these three fixed vectors. This type of view 
is called a basis representation , and it is implicit in much of mathematics. 
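
A small numerical sketch of this decomposition may help. It assumes an orthonormal frame (here a hypothetical one, rotated 45 degrees about Z) and shows that the three dot products are the coordinates, and that the scaled basis vectors sum back to the original vector.

```python
import math


def dot(u, v):
    """Dot product of two vectors given as lists."""
    return sum(a * b for a, b in zip(u, v))


# A hypothetical orthonormal frame: the usual axes rotated 45 degrees about Z.
s = math.sqrt(0.5)
X = [s, s, 0.0]
Y = [-s, s, 0.0]
Z = [0.0, 0.0, 1.0]

V = [2.0, -1.0, 3.0]

# Projection of V onto each axis gives its coordinate in this frame.
coords = [dot(V, X), dot(V, Y), dot(V, Z)]

# Summing the scaled basis vectors rebuilds V, as in Equation 5.1.
rebuilt = [coords[0] * X[i] + coords[1] * Y[i] + coords[2] * Z[i]
           for i in range(3)]
```

The same vector has different coordinates in different frames, but the reconstruction always returns the original vector as long as the frame is orthonormal.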

Basis representations for objects in a space have three properties we will find 
useful: 

■ Operation mapping: Any operations we may want to perform on the objects 
may be carried out by operations on the bases. 

■ Equivalence classes: We can compare any two objects easily, even if they do 
not immediately appear alike. If their basis representations are the same, then 
the two objects are the same (at least to within the characteristics represented 
by the basis functions). 

■ Completeness: A description of the bases and how they combine describes all 
possible objects in that space. 


5.2.2 Projection of Functions 

Let us now turn our attention from vectors to functions. Consider any 1D function 
y = f(x). It is reasonable to ask if this function can be decomposed into a set of 
other, simpler functions, like the projection of a vector onto basis vectors. Such a 
projection would have the properties of a basis representation listed above. 

FIGURE 5.2 
Two bar chart functions. 

By analogy to Euclidean space, we will combine our functions using simple 
addition. So we might write any function f(t) as a sum of basis functions \phi_i(t) and 
scalar weights c_i: 

    f(t) = c_1 \phi_1(t) + c_2 \phi_2(t) + c_3 \phi_3(t) + \cdots + c_n \phi_n(t)    (5.2)

Consider an example class of functions we might like to capture, called the five- 
sample unit-interval bar chart functions , or more simply, the bar chart functions , 
illustrated in Figure 5.2. The bar chart functions are nonzero only between 0 and 5, 
and they have unit height within each interval of integers in that range. 

We can define a set of five basis functions \phi_1(t) to \phi_5(t) to describe any bar chart 
function in the following way: 

    \phi_i(t) = \begin{cases} 0 & t < i - 1 \\ 1 & i - 1 \le t < i \\ 0 & t \ge i \end{cases}    i = 1, 2, 3, 4, 5    (5.3)

These basis functions are illustrated in Figure 5.3. One way to look at this is to 
imagine that we have created a five-dimensional space (one for each basis function), 
so that each point in that space corresponds to some particular bar chart function. 

FIGURE 5.3 
The five bar chart basis functions. 

So a point P = (p_1, p_2, p_3, p_4, p_5) corresponds to a bar chart function 

    b(t) = p_1 \phi_1(t) + p_2 \phi_2(t) + p_3 \phi_3(t) + p_4 \phi_4(t) + p_5 \phi_5(t) 
         = \sum_{i=1}^{5} p_i \phi_i(t)    (5.4)

Figure 5.2 shows two examples of points in this space. 
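
The bar chart basis is simple enough to write down directly. In this sketch the half-open interval convention (each box includes its left edge but not its right) is an assumption made so that the five boxes do not overlap; the names `phi` and `bar_chart` and the point `P` are illustrative.

```python
def phi(i, t):
    """Bar chart basis function i (1 through 5): 1 on [i - 1, i), 0 elsewhere."""
    return 1.0 if i - 1 <= t < i else 0.0


def bar_chart(p, t):
    """Evaluate b(t) = sum_i p_i phi_i(t) for a point p in the 5D space."""
    return sum(p_i * phi(i, t) for i, p_i in enumerate(p, start=1))


P = (2.0, 0.5, 3.0, 1.0, 4.0)   # a hypothetical point: one height per bar

# Sampling in the middle of each unit interval recovers the five heights.
samples = [bar_chart(P, t) for t in (0.5, 1.5, 2.5, 3.5, 4.5)]

# Outside the interval from 0 to 5 the function is zero.
outside = [bar_chart(P, t) for t in (-1.0, 5.0, 7.2)]
```

Each coordinate of `P` controls exactly one bar, which is why five basis functions suffice for this class.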

This simple example enabled us to answer by examination two important questions 
we must ask of all candidates for bases. We now ask those questions explicitly, 
because they are important when we generalize this procedure to more complicated 
situations. 

■ How many bases do we need? In 3D space, we need three basis vectors. For 
the bar charts, we needed five basis functions. 

■ Is this set of bases complete? In 3D space, the three vectors are mutually 
orthogonal, so linear algebra tells us that they span (and therefore describe) 
the entire space. For the bar charts, since each bar chart is completely characterized 
by its five heights, and each function specifies one of those heights, we 
have captured the entire class of bar chart functions. 

Our answer to the second question for the bar charts was not rigorous; we 
would prefer a more satisfying answer that will generalize to more abstract spaces. 
Specifically, we need some tools to ensure that for any space we have enough basis 
functions, and that none of them are redundant. To satisfy the latter condition in 
3D space, we need basis vectors that are linearly independent; so much the better if 
they are orthogonal, or all pairwise perpendicular. We can generalize the notion of 
orthogonality to functions. 

5.2.3 Orthogonal Families of Functions 

A useful definition of functional orthogonality is usually expressed with respect 
to each pair of functions in the set, within some interval \Gamma = [t_1, t_2]. Any two 
complex-valued functions \phi_i(t) and \phi_k(t) in the set are orthogonal if they satisfy the 
orthogonality constraint, presented here in braket form (in this section, all brakets 
use the integral form over the complex numbers): 

    \langle \phi_i | \phi_k \rangle_\Gamma = 0,    i \ne k    (5.5)


To interpret Equation 5.5 it may help to think of the braket as a sort of generalized 
dot product. This tells us that the projection of any function in the set onto any other 
function is 0, just as the projection of one axis of an orthogonal reference 
frame onto another (such as the X axis onto the Y axis in Euclidean space) is zero. When a set of functions 
is orthogonal, this tells us that we can combine them particularly easily to build up 
more complex functions in the space, in a way analogous to the representation of a 
vector in Euclidean 3D space by its projections onto the X, Y, and Z axes. 

For example, suppose we had a function that was just a single scaled version of 
one of the basis functions; say f(t) = r \phi_i(t) for r \in \mathcal{R}. If the family is orthogonal, 
then this is the simplest way to represent f; it can’t be built from a combination of 
any of the other functions in the family because its projection onto any other basis 
(that is, \langle f | \phi_k \rangle for k \ne i) is 0. If the family isn’t orthogonal, then one or more of 
these projections might be nonzero, and we would have several ways of representing 
the same function f. There’s nothing wrong with that, but it is often advantageous to 
know the simplest possible representation of something; when the simplest version 
can be used it reduces notational and conceptual clutter. Orthogonal families of 
functions also will make many manipulations easier, as we will see below. 

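We say that an orthogonal system is normalized over \Gamma if 

    \langle \phi_i | \phi_i \rangle_\Gamma = 1    (5.6)

Any orthogonal system may be normalized by multiplying each function by the 
appropriate scaling factor \beta_i: 

    1 = \langle \beta_i \phi_i | \beta_i \phi_i \rangle_\Gamma 
      = \beta_i^2 \langle \phi_i | \phi_i \rangle_\Gamma    (5.7)

so that 

    \beta_i = \frac{1}{\sqrt{\langle \phi_i | \phi_i \rangle_\Gamma}}    (5.8)

We define the norm of a function \phi_i, written \|\phi_i\|, to be 

    \|\phi_i\| = \sqrt{\langle \phi_i | \phi_i \rangle}    (5.9)

Thus, a system of functions is normalized if \|\phi_i\| = 1 for all i. 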
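
Orthogonality and normalization can be checked numerically for a concrete family. The sketch below assumes real-valued functions and approximates the braket integral with a simple midpoint rule; the family cos(nt) on [0, pi] and the helper names are illustrative choices, not part of the text.

```python
import math


def braket(f, g, t1, t2, steps=20000):
    """Midpoint-rule approximation to the integral of f(t) g(t) over [t1, t2].

    Assumes real-valued f and g, so no conjugation is needed.
    """
    dt = (t2 - t1) / steps
    total = 0.0
    for i in range(steps):
        t = t1 + (i + 0.5) * dt
        total += f(t) * g(t)
    return total * dt


def cos_n(n):
    """Return the function t -> cos(n t)."""
    return lambda t: math.cos(n * t)


t1, t2 = 0.0, math.pi

cross = braket(cos_n(1), cos_n(2), t1, t2)      # orthogonal: close to 0
self_product = braket(cos_n(2), cos_n(2), t1, t2)   # close to pi / 2, not 1

# Scale factor that normalizes cos(2t) over the interval.
beta = 1.0 / math.sqrt(self_product)


def normalized(t):
    return beta * math.cos(2 * t)


unit = braket(normalized, normalized, t1, t2)   # close to 1
```

The family is orthogonal as given, and a single multiplication per function makes it normalized as well.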

We want to find a family of orthogonal functions that is complete for some 
set of functions /, which means that each / in the set may be represented by a 
combination of those functions, just as any vector in 3D space may be represented 
by a combination of the three primary basis vectors. If no more orthogonal functions 
can be added to such a family, we say it is a complete set. 

We will now see how to find the weights c_i for any given f and choice of \phi_i. For 
simplicity, we will assume that the \phi_i are real-valued functions, so \langle \overline{\phi_i} | \phi_i \rangle = \langle \phi_i | \phi_i \rangle. 
Suppose for now that we have found an orthogonal family of basis functions \phi_i(t). 
Recall that Equation 5.2 expressed f as a weighted sum of each \phi_i. If we know the 
bases, we only need to find the c_i to write out f in that basis representation. 

Suppose that the summed basis functions \phi_i approximate the original function, 
but are not an exact match. Rewriting Equation 5.2 to approximate f(t), 

    f(t) \approx \sum_{i=1}^{n} c_i \phi_i(t)    (5.10)

we ask, What would be the best choice of the weights c_i to minimize the approximation 
error within some interval \Gamma = [t_1, t_2]? 

We begin by defining the mean squared error (MSE) of this approximation over 
this interval. The name of the error term comes from its construction: at each point 
in the interval, we find the difference between the function and its approximation 
and square that difference. We then find the average (or mean) of this squared error 
over the interval. In symbols, the MSE is defined as 

    MSE = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \left[ f(t) - \sum_{i=1}^{n} c_i \phi_i(t) \right]^2 dt    (5.11)

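Equation 5.11 contains everything we need to know to generate the weights 
c_i, but we would prefer a closed-form expression for each weight. To derive that 
expression, we begin by explicitly expanding the summation term and then squaring 
it. The cross terms c_i c_k \phi_i(t) \phi_k(t) with i \ne k are omitted because they integrate 
to zero by orthogonality: 

    MSE = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \left[ f(t) - c_1 \phi_1(t) - c_2 \phi_2(t) - \cdots - c_n \phi_n(t) \right]^2 dt 

        = \frac{1}{t_2 - t_1} \int_{t_1}^{t_2} \left[ f^2(t) + c_1^2 \phi_1^2(t) + c_2^2 \phi_2^2(t) + \cdots + c_n^2 \phi_n^2(t) 
              - 2 c_1 f(t) \phi_1(t) - 2 c_2 f(t) \phi_2(t) - \cdots - 2 c_n f(t) \phi_n(t) \right] dt 

        = \frac{1}{t_2 - t_1} \left\{ \int_{t_1}^{t_2} f^2(t) dt + c_1^2 k_1 + c_2^2 k_2 + \cdots + c_n^2 k_n 
              - 2 c_1 \gamma_1 - 2 c_2 \gamma_2 - \cdots - 2 c_n \gamma_n \right\}    (5.12)

The last expression uses the substitutions 

    k_i = \langle \phi_i | \phi_i \rangle_\Gamma 
    \gamma_i = \langle f | \phi_i \rangle_\Gamma 
    \Gamma = [t_1, t_2]    (5.13)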


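In the last expression in Equation 5.12, we have several terms of the form 
(c_i^2 k_i - 2 c_i \gamma_i). We can complete the square: 

    c_i^2 k_i - 2 c_i \gamma_i = \left( c_i \sqrt{k_i} - \frac{\gamma_i}{\sqrt{k_i}} \right)^2 - \frac{\gamma_i^2}{k_i}    (5.14)

and rewrite the last expression in Equation 5.12 as 

    MSE = \frac{1}{t_2 - t_1} \left\{ \int_{t_1}^{t_2} f^2(t) dt + \sum_{i=1}^{n} \left( c_i \sqrt{k_i} - \frac{\gamma_i}{\sqrt{k_i}} \right)^2 - \sum_{i=1}^{n} \frac{\gamma_i^2}{k_i} \right\}    (5.15)

To minimize the MSE expressed in Equation 5.15, we drive the squared error 
term to zero by setting 

    c_i \sqrt{k_i} = \frac{\gamma_i}{\sqrt{k_i}}    (5.16)

for each i. Dividing both sides of Equation 5.16 by \sqrt{k_i} gives us the closed-form 
expression we seek: 

    c_i = \frac{\gamma_i}{k_i} = \frac{\int_{t_1}^{t_2} f(t) \phi_i(t) dt}{\int_{t_1}^{t_2} \phi_i^2(t) dt} = \frac{\langle f | \phi_i \rangle_\Gamma}{\langle \phi_i | \phi_i \rangle_\Gamma}    (5.17)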

We have achieved our goal of finding the values of c_i that minimize the approximation 
error. Recall that Equation 5.10 expressed an approximation of any function 
f with a set of n mutually orthogonal, weighted basis functions \phi_1(t), ..., \phi_n(t), 
over an interval [t_1, t_2]. For any function f and set of basis functions, Equation 5.17 
tells us how to compute the weight for each basis function. 

One interpretation of Equation 5.10 is simply as a recipe for a change of coordinates 
from one space to another. The function f is “projected” onto each member 
of \phi_i(t), much as a vector in 3D is projected onto each axis. The magnitude of each 
projection scales the associated basis function; the scaled bases are then summed together 
to represent the original function. 
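
The weight formula can be tried on a concrete case. The sketch below is an illustration under assumptions: the target f(t) = t, the four-member basis cos(nt) on [0, pi], and the midpoint-rule integrator are all hypothetical choices. It computes each weight as the ratio of two integrals and then confirms that nudging a weight away from that value raises the mean squared error.

```python
import math


def integrate(g, t1, t2, steps=4000):
    """Midpoint-rule approximation to the integral of g over [t1, t2]."""
    dt = (t2 - t1) / steps
    return sum(g(t1 + (i + 0.5) * dt) for i in range(steps)) * dt


t1, t2 = 0.0, math.pi


def f(t):
    """Function to approximate (an arbitrary example)."""
    return t


# Orthogonal basis on [0, pi]: cos(0 t), cos(1 t), cos(2 t), cos(3 t).
basis = [lambda t, n=n: math.cos(n * t) for n in range(4)]

# Each weight is <f | phi_i> / <phi_i | phi_i> (real functions assumed).
weights = [
    integrate(lambda t: f(t) * phi(t), t1, t2)
    / integrate(lambda t: phi(t) ** 2, t1, t2)
    for phi in basis
]


def mse(c):
    """Mean squared error of the approximation with weights c."""
    def squared_error(t):
        approx = sum(ci * phi(t) for ci, phi in zip(c, basis))
        return (f(t) - approx) ** 2
    return integrate(squared_error, t1, t2) / (t2 - t1)


best = mse(weights)
nudged = mse([weights[0] + 0.05] + weights[1:])   # any change makes it worse
```

Note that each weight is computed independently of the others; that independence is what orthogonality buys.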




Sometimes we want to use a set of basis functions that are not orthogonal. In that 
case we can transform this “original” set of nonorthogonal bases into a new set that 
is orthogonal. This new set is called the dual basis [421] or reciprocal basis [54]. For 
a given family of basis functions {a_i(t)}, the dual basis is typically written as {\tilde{a}_i(t)}. 

The characterizing feature of duals is that they form a basis that is orthogonal to 
the original basis. That is, for real functions a_i, 

    \langle a_i | \tilde{a}_k \rangle = \begin{cases} 1 & i = k \\ 0 & \text{otherwise} \end{cases}    (5.18)


Duals are useful because they give us the projection coefficients onto the original 
basis when that basis isn’t orthogonal. Suppose we want to represent our function 
f as a sum of scaled basis functions c_i a_i: 

    f = \sum_{i=1}^{\infty} c_i a_i    (5.19)

To find these c_i when the {a_i} are not orthogonal, we find the projection of f onto 
the duals: 

    \langle f | \tilde{a}_k \rangle = \left\langle \sum_{i=1}^{\infty} c_i a_i \middle| \tilde{a}_k \right\rangle 
                                    = \sum_{i=1}^{\infty} c_i \langle a_i | \tilde{a}_k \rangle 
                                    = c_k    (5.20)

which gives us c_k, the coefficient on the kth basis function. So we use the dual 
functions to analyze a function and find its coefficients in some basis, and the original 
functions to synthesize the function back from its coefficients. 
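
A two-dimensional vector example makes the analysis and synthesis roles concrete. This sketch is an assumption-laden illustration: it finds the duals by inverting the 2 x 2 matrix whose columns are the basis vectors, which is one standard way to obtain vectors satisfying the biorthogonality condition and is not the construction described in the text. All names and values are hypothetical.

```python
def dot(u, v):
    """Dot product of two 2D vectors."""
    return u[0] * v[0] + u[1] * v[1]


def dual_basis_2d(a1, a2):
    """Return duals (d1, d2) with dot(a_i, d_k) = 1 if i == k, else 0.

    The duals are the rows of the inverse of the matrix whose columns
    are a1 and a2.
    """
    det = a1[0] * a2[1] - a2[0] * a1[1]
    if det == 0:
        raise ValueError("basis vectors must be linearly independent")
    d1 = (a2[1] / det, -a2[0] / det)
    d2 = (-a1[1] / det, a1[0] / det)
    return d1, d2


# A nonorthogonal basis for the plane.
a1 = (1.0, 0.0)
a2 = (1.0, 1.0)
d1, d2 = dual_basis_2d(a1, a2)

# Synthesis uses the original basis: f = 3 a1 + 2 a2.
f = (3.0 * a1[0] + 2.0 * a2[0], 3.0 * a1[1] + 2.0 * a2[1])

# Analysis uses the duals: projecting f onto them recovers the weights.
c1 = dot(f, d1)
c2 = dot(f, d2)

# Projecting onto the original (nonorthogonal) basis does not.
naive_c1 = dot(f, a1)
```

Projecting onto `a1` itself gives 5 rather than 3, which is why the duals are needed when the basis is not orthogonal.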

We can construct the dual basis corresponding to a given basis by a standard 
process called Gram-Schmidt orthogonalization , which we briefly review here [421]. 
We will discuss our functions as vectors because the operations have a strongly 
geometric flavor. But this procedure is perfectly applicable to functions; we only 
need to be able to compute the inner product to carry out the construction of the 
duals. 

Suppose we have a trio of noncolinear vectors a, b, and c, and we want to make 
a new trio A, B, and C that are mutually orthogonal. We start by taking A = a, so 
this vector need not be changed at all. Now, to make a vector B out of b that will be 
orthogonal to A, we need to remove that component of b that is not perpendicular 
to A. That’s easily found; it’s just the projection of b onto A, as shown in Figure 5.4. 

In other words, we can decompose b into two vectors, 6 = b\\ A + 6_ l .4, one parallel 
to A and one perpendicular. The perpendicular part is the one we want: 


$$B = b_{\perp A} = b - b_{\parallel A} = b - \frac{b \cdot A}{A \cdot A}\,A \tag{5.21}$$


where we have found the parallel projection from the dot product of $b$ on $A$. Now
we have two mutually orthogonal vectors, $A$ and $B$.

To make C, we repeat the process, removing those components of c that are 
parallel to either A or B: 


$$C = c - c_{\parallel A} - c_{\parallel B} = c - \frac{c \cdot A}{A \cdot A}\,A - \frac{c \cdot B}{B \cdot B}\,B \tag{5.22}$$


We can now normalize each vector simply by dividing by its magnitude. 

FIGURE 5.4

The projection of vector b onto A.

Generalizing this procedure, any set of functions $\{a_i\}$ can be orthogonalized to a
new family $\{v_i\}$ by the following algorithm:

$$\begin{aligned}
v_1 &= a_1 \\
v_i &= a_i - \sum_{k=1}^{i-1} \frac{v_k \cdot a_i}{v_k \cdot v_k}\,v_k \\
&= a_i - \sum_{k=1}^{i-1} \frac{\langle v_k | a_i \rangle}{\langle v_k | v_k \rangle}\,v_k
\end{aligned} \tag{5.23}$$


This process is called Gram-Schmidt orthogonalization. Incidentally, we can write 
the process of Equation 5.23 concisely in matrix notation as V = QR , where the 
columns of Q are orthogonal, and R is an upper-triangular and invertible matrix; 
this is a convenient form for computation [421]. 
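The recurrence of Equation 5.23 translates almost directly into code. The following is a minimal sketch for ordinary vectors, assuming the inner product is the dot product; the function name `gram_schmidt` and the sample vectors are illustrative choices, not part of the text.

```python
import numpy as np

def gram_schmidt(vectors):
    """Orthogonalize the rows of `vectors` following Equation 5.23."""
    ortho = []
    for a in vectors:
        v = a.astype(float).copy()
        for u in ortho:
            # Remove the component of a that is parallel to the earlier vector u.
            v -= (np.dot(u, a) / np.dot(u, u)) * u
        ortho.append(v)
    return np.array(ortho)

a = np.array([[1.0, 1.0, 0.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
V = gram_schmidt(a)

# Matrix of pairwise inner products; the off-diagonal entries should vanish.
gram = V @ V.T
```

The first output vector equals the first input vector, and each later vector has had its projections onto all earlier outputs removed. Dividing each row of `V` by its length would normalize the set.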

If a set of bases is orthogonal, then it is its own dual: $\{a_i\} = \{\tilde{a}_i\}$. We will exploit
this property often in this book by restricting our attention to orthogonal functions.


5.2.5 The Complex Exponential Basis 

In this section we will propose the complex exponentials as a basis for most
real-valued functions. We will define the family and then show that they satisfy the
complex orthogonality constraint.

Our candidate family for a set of continuous-time basis functions is the set

$$\psi_n(t) = e^{jn\omega t}, \qquad n \in \mathcal{Z} \tag{5.24}$$

These functions have a period $T = 2\pi/\omega$, and they are plotted for $n = 0, 1, 2, 3$ in
Figure 5.5.

FIGURE 5.5

(a) $n = 0$: $\omega = 0$ and $T = \infty$. (b) $n = 1$: $\omega = 1$ and $T = 2\pi$. (c) $n = 2$: $\omega = 2$ and $T = 2\pi/2$.
(d) $n = 3$: $\omega = 3$ and $T = 2\pi/3$.

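To test for orthogonality, we choose an interval of one period, starting at an
arbitrary time $t_0$: $[T] = [t_0, t_0 + T]$. We then apply Equation 5.5 over this period to
two functions $\psi_n(t)$ and $\psi_m(t)$:

$$\langle \psi_m | \psi_n \rangle_{[T]} = \int_{t_0}^{t_0 + 2\pi/\omega} \overline{\psi_m(t)}\,\psi_n(t)\,dt = \int_{t_0}^{t_0 + 2\pi/\omega} e^{j\omega t(n-m)}\,dt \tag{5.25}$$

If $n = m$, then $e^{j\omega t(n-m)} = e^0 = 1$, and the integral simplifies to

$$\langle \psi_n | \psi_n \rangle_{[T]} = \int_{t_0}^{t_0 + 2\pi/\omega} dt = \frac{2\pi}{\omega} = T \tag{5.26}$$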


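If $n \neq m$, we recall from E11 (Table 4.1) that $\int e^{at}\,dt = e^{at}/a$, so

$$\langle \psi_m | \psi_n \rangle_{[T]} = \left[ \frac{e^{j\omega t(n-m)}}{j\omega(n-m)} \right]_{t_0}^{t_0 + 2\pi/\omega} = \frac{e^{j\omega t_0(n-m)}}{j\omega(n-m)} \left[ e^{j2\pi(n-m)} - 1 \right] = 0 \tag{5.27}$$

The final reduction comes from noticing that the term in brackets is 0 because of
E17. Summarizing the two results above:

$$\langle \psi_m | \psi_n \rangle_{[T]} = \begin{cases} T & n = m \\ 0 & \text{otherwise} \end{cases} \tag{5.28}$$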


Equation 5.28 was our goal in this section. It shows that the complex exponentials 
indeed form an orthogonal family. They are a complete set as well, but we will not 
prove that here. We will use this set as the basis for the Fourier series and Fourier 
transform below. To normalize this basis, we can divide each function by its
norm $\sqrt{T}$.
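The orthogonality result can be verified numerically by approximating the inner product over one period with a sum. This is a sketch under assumed values: `omega`, `t0`, and the sample count `N` are arbitrary, and the helper names `psi` and `braket` are illustrative only.

```python
import numpy as np

omega = 2.0                      # fundamental frequency (arbitrary choice)
T = 2.0 * np.pi / omega          # one period
t0 = 0.3                         # arbitrary starting time
N = 4096
t = t0 + T * np.arange(N) / N    # uniform samples across one period
dt = T / N

def psi(n):
    """The n-th complex exponential basis function sampled on t."""
    return np.exp(1j * n * omega * t)

def braket(f, g):
    """<f|g> over one period; conjugates its first argument (Riemann sum)."""
    return np.sum(np.conj(f) * g) * dt

same = braket(psi(3), psi(3))    # expected to equal T
cross = braket(psi(2), psi(5))   # expected to vanish
```

Changing `t0` leaves both results unchanged, which reflects the freedom to start the interval $[T]$ at any time.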

5.3 Representation in Bases of Lower Dimension

Suppose that we have a function $f$ that takes two $n$-dimensional real vectors $\mathbf{a}$ and
$\mathbf{b}$ and returns a scalar; that is, $f: \mathcal{R}^n \otimes \mathcal{R}^n \mapsto \mathcal{R}$. We have seen that if we have a set
of basis functions that also map $\mathcal{R}^n \otimes \mathcal{R}^n$ to $\mathcal{R}$, we could find a set of coefficients
that represent the projection of $f$ onto this new basis.

What may be surprising is that we can represent our function $f$ using bases that
just map $\mathcal{R}^n$ to $\mathcal{R}$. We’ll see how to do that in this section. Before we go through the
math, however, we can draw a picture of the process in 2D that may help explain
how it works. We’ll map a 2D function $f: \mathcal{R} \otimes \mathcal{R} \mapsto \mathcal{R}$ onto 1D basis functions
$\phi: \mathcal{R} \mapsto \mathcal{R}$.

Consider the accompanying figure. Here we have a function $f(x, y)$, and we have isolated the
one-parameter function $g(y) = f_x(y)$; that’s the curve parameterized by $y$ at a given
$x$. We ask how to find $g(y)$ in terms of a 1D family of orthonormal bases $\{b_i(t)\}$. We
will assume for discussion that the function $f$ is such that we can match $f_x(y)$ with
just three bases, $b_0$, $b_1$, and $b_2$. We can then write

$$f_x(y) = \sum_{i=0}^{2} k_i\,b_i(y) \tag{5.29}$$

where the $k_i$ describe how to combine the bases for this particular slice at $x$. We
can find the $k_i$ by evaluating three functions, $c_0(t)$, $c_1(t)$, and $c_2(t)$, at $t = x$; then
$f_x(y)$ is

$$f_x(y) = \sum_{i=0}^{2} c_i(x)\,b_i(y) \tag{5.30}$$

These three functions $c_i$ are just 1D, so there’s no reason not to think of also projecting
them onto the basis $\{b_i(t)\}$. The result will be three values for each function $c_i$
describing how to combine the bases to get $c_i(t)$ at any $t$; these in turn give three
values for combining the same bases to find the curve $f_y(x)$. You might try to
convince yourself that these nine numbers correctly capture all the information in
the original 2D function, given our guarantee that three bases were all that were
needed. Then given the bases, we need only save the nine scalars, which may prove
to be a very efficient way to store the function.

We’ll now show how this works in general. Suppose we have a function
$f: \mathcal{R}^n \otimes \mathcal{R}^n \mapsto \mathcal{R}$, and a set of orthonormal basis functions $\phi: \mathcal{R}^n \mapsto \mathcal{R}$, such that
every slice of $f$ is in the space spanned by $\{\phi\}$. Then for any two vectors $\mathbf{a}, \mathbf{b} \in \mathcal{R}^n$,
we can write

$$f(\mathbf{a}, \mathbf{b}) = \sum_m c_m(\mathbf{a})\,\phi_m(\mathbf{b}) \tag{5.31}$$

where we have thought of $f$ as a single curve given by the slice of $f$ where $\mathbf{a}$ is held
constant. We can project $f$ onto the duals of the basis to get the coefficients $c_m$, and
since the basis is orthonormal, it is its own dual; so

$$c_m(\mathbf{a}) = \int f(\mathbf{a}, \mathbf{b})\,\phi_m(\mathbf{b})\,d\mathbf{b} \tag{5.32}$$

Now we can think of each $c_m(\mathbf{a})$ as a one-parameter curve in its own right, and
project it too onto the basis:

$$c_m(\mathbf{a}) = \sum_n d_{m,n}\,\phi_n(\mathbf{a}) \tag{5.33}$$

where again we find the coefficients of the expansion from projection onto the basis
functions:

$$d_{m,n} = \int c_m(\mathbf{a})\,\phi_n(\mathbf{a})\,d\mathbf{a} \tag{5.34}$$

Now we can gather the pieces we have just found. Starting again from the
original definition of $f$ in Equation 5.31, and substituting Equation 5.33 for $c_m(\mathbf{a})$,
Equation 5.34 for $d_{m,n}$, and Equation 5.32 for $c_m(\mathbf{a})$, we find

$$\begin{aligned}
f(\mathbf{a}, \mathbf{b}) &= \sum_m c_m(\mathbf{a})\,\phi_m(\mathbf{b}) \\
&= \sum_m \sum_n d_{m,n}\,\phi_n(\mathbf{a})\,\phi_m(\mathbf{b}) \\
&= \sum_m \sum_n \left\{ \int c_m(\mathbf{a})\,\phi_n(\mathbf{a})\,d\mathbf{a} \right\} \phi_n(\mathbf{a})\,\phi_m(\mathbf{b}) \\
&= \sum_m \sum_n \left\{ \int \left[ \int f(\mathbf{a}, \mathbf{b})\,\phi_m(\mathbf{b})\,d\mathbf{b} \right] \phi_n(\mathbf{a})\,d\mathbf{a} \right\} \phi_n(\mathbf{a})\,\phi_m(\mathbf{b})
\end{aligned} \tag{5.35}$$

The projection of a 2D function onto 1D bases. (a) A slice of a 2D function is a 1D function, here made of two humps. The
basis functions $b_0$, $b_1$, and $b_2$ are scaled and summed to match that slice. They are scaled by values of $c_0$, $c_1$, and $c_2$ evaluated
at 0.5. (b) The functions $c_0$, $c_1$, and $c_2$ are themselves built from the bases, here scaled by $d_0$, $d_1$, and $d_2$.


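We can organize this as a matrix multiply by pulling one of the basis functions up
front:

$$f(\mathbf{a}, \mathbf{b}) = \sum_m \sum_n \phi_m(\mathbf{b})\,k_{mn}\,\phi_n(\mathbf{a}) \tag{5.36}$$

where

$$k_{mn} = \iint f(\mathbf{a}, \mathbf{b})\,\phi_m(\mathbf{b})\,\phi_n(\mathbf{a})\,d\mathbf{b}\,d\mathbf{a} \tag{5.37}$$

In summary, gathering the basis functions into column vectors,

$$f(\mathbf{a}, \mathbf{b}) = \boldsymbol{\phi}^T(\mathbf{b})\,\mathbf{K}\,\boldsymbol{\phi}(\mathbf{a}) \tag{5.38}$$

where $\mathbf{K}$ is the matrix of coefficients $k_{mn}$ and $\boldsymbol{\phi}^T(\mathbf{b})$ is a row vector (the transpose
of the column vector) of basis functions evaluated at $\mathbf{b}$.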
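The nine-number idea can be tried on a small case. The sketch below assumes scalar parameters on $[-1, 1]$, three orthonormal basis functions built from Legendre polynomials, and a test function `f` chosen so that every slice lies in their span; all of these, along with the names `phi`, `K`, and `f_rebuilt`, are illustrative assumptions. The double integral of Equation 5.37 is evaluated with Gauss-Legendre quadrature.

```python
import numpy as np

# Quadrature nodes and weights on [-1, 1]; exact for the low-degree
# polynomials used here.
x, w = np.polynomial.legendre.leggauss(8)

def phi(t):
    """Three orthonormal basis functions on [-1, 1], stacked along axis 0."""
    t = np.asarray(t, dtype=float)
    return np.stack([np.full_like(t, np.sqrt(0.5)),
                     np.sqrt(1.5) * t,
                     np.sqrt(5.0 / 8.0) * (3.0 * t**2 - 1.0)])

def f(a, b):
    """A 2D test function whose slices are at most quadratic in a and in b."""
    return 1.0 + 2.0 * a * b + 3.0 * a**2 * b - b**2

P = phi(x)                       # P[m, i] = phi_m(x_i)
F = f(x[:, None], x[None, :])    # F[i, j] = f(a = x_i, b = x_j)

# Equation 5.37: K[m, n] = double integral of f(a, b) phi_m(b) phi_n(a) db da.
K = np.einsum('mj,j,ij,i,ni->mn', P, w, F, w, P)

def f_rebuilt(a, b):
    """Equation 5.38: phi^T(b) K phi(a)."""
    return phi(b) @ K @ phi(a)
```

Here `K` is a 3-by-3 array, so the whole 2D function is stored in nine scalars, and `f_rebuilt(a, b)` can be compared with `f(a, b)` at any point of the square.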

Although we have used a function of two parameters, all of this works for 
functions that map any number of parameters, as long as they are all the same size 
and the basis functions also take vectors of that size. The bookkeeping gets a bit 
more complicated with each level. For example, for three parameters a, b, c we get 
the equation sets (where we will temporarily use j as an integer index): 



$$\begin{aligned}
f(x, y, z) &= \sum_i c_i(x, y)\,\phi_i(z) \\
c_i(x, y) &= \int f(x, y, z)\,\phi_i(z)\,dz = \sum_j d_{ij}(x)\,\phi_j(y) \\
d_{ij}(x) &= \int c_i(x, y)\,\phi_j(y)\,dy = \sum_k e_{ijk}\,\phi_k(x) \\
e_{ijk} &= \int d_{ij}(x)\,\phi_k(x)\,dx
\end{aligned} \tag{5.39}$$

yielding the equivalence

$$f(x, y, z) = \sum_i \sum_j \sum_k \phi_i(z)\,\phi_j(y)\,\phi_k(x)\,e_{ijk} \tag{5.40}$$


We can write this in a more compact form using brackets; this shows the pattern 
behind the general formula at a glance, but would be a mystifying place to start 
because of all the suppressed subscripts and arguments: 


f(x,y,z) = (((™ijk\4>) k€Z \<l>)j €Z <t>). 
= (<< e ufcl^) J6 7?k) y€ 7?|^) 


x€Tl 


(5.41) 


We can always build up a multidimensional transform like this from basis
functions that map $\mathcal{R}^n \mapsto \mathcal{R}$. If we’re taking the transform of an $m$-dimensional signal
using $n$ basis functions, we will end up with $n^m$ coefficients. If the signal is
discretized (say into an $m$-dimensional grid that is $k$ units on a side), then the efficiency
$e$ of the transformed representation may be written as $e = k^m / n^m = (k/n)^m$. If
$k > n$, then the transformation stores fewer coefficients than there were samples, and so
represents a compression of the original data. Often
this is a lossy compression, so that we can only recover an approximation of the
original function from the projected form: for example, if we try to represent a cubic
function from its projection onto linear bases. If $k < n$, we have an expansion of
our storage requirements: for example, if we project a linear function onto cubic
bases. By carefully choosing our bases, we may be able to find a lossy compression
that saves computation and storage but retains some particular features of interest.


5.4 Continuous-Time Fourier Representations 

The essence of the Fourier transform is that many real-valued signals may be
represented by a combination of weighted complex sinusoids of different frequencies,
amplitudes, and phases.

When this was first presented, it was not an obvious result, and was in fact resisted 
for many years. Interesting discussions of the history of the Fourier transform, and 
the surprisingly complex mathematical politics within which it was developed, are 
given in books by Oppenheim, Willsky, and Young [327] and Bracewell [61]. Briefly, 
the representation of functions as sums of harmonics began with Euler in 1748, who 
used trigonometric series to approximate functions, but abandoned them. The use 
of trigonometric series was disparaged from that point on by many influential
mathematicians. When Fourier presented his now-famous paper in 1807, his ideas were
ignored by much of the mathematical community, partly because he did not provide
a formal basis for his arguments. Only when Dirichlet proved rigorous convergence
conditions for the Fourier series in 1829 did the technique find acceptance. It is now
an indispensable theoretical and practical tool.

In this section we will present the definitions for the Fourier series expansion and 
the Fourier transform. Specifically, we will define the Fourier series only for periodic 
signals, and the Fourier transform for aperiodic signals. The power of the Fourier 
transform is that it may be used for periodic signals as well. 


5.5 The Fourier Series


The Fourier series allows us to represent a periodic, continuous signal as a sum of 
individual complex sinusoids. It is sometimes also called the Fourier series expansion.

Recall that Equation 5.28 demonstrated that the complex exponentials in
Equation 5.24 are indeed a mutually orthogonal basis on the free interval $[T]$. The Fourier
series takes any signal $x(t)$ and projects it onto the complex exponentials as a basis.
The weight we get for each exponential is the proper scaling factor that allows us to 
sum the exponentials back together again to get the original signal. 

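Taking the period of a periodic signal $x(t)$ to be of width $T$, we may thus recover
$x(t)$ in this interval using the complex exponential basis as

$$x(t) = \sum_k a_k e^{jk\omega t} = \langle \bar{a}_k \,|\, \psi_k \rangle \tag{5.42}$$

Note that we have used $\bar{a}_k$, the complex conjugate of $a_k$, in this equation since the braket conjugates its first
argument. The coefficients $a_k$ are given by Equation 5.17 using the family of
Equation 5.24, and writing $\psi_k = e^{jk\omega t}$, finding

$$a_k = \frac{\langle \psi_k | x \rangle_{[T]}}{\langle \psi_k | \psi_k \rangle_{[T]}} = \frac{1}{T} \langle \psi_k | x \rangle_{[T]} \tag{5.43}$$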


Recall that [T] means any interval of width T. 

Equations 5.42 and 5.43 show how to transform a signal with period T into 
a sum of complex sinusoids, and how to sum a set of sinusoids together again to 
recover the signal. They come naturally out of the orthogonality constraints over an 
interval (hence the restriction to periodic functions). The Fourier series expansion 
uses complex sinusoids as the basis functions, and finds the weights such that any 
approximation of n terms minimizes the mean squared error to the real function. 

We will gather Equations 5.42 and 5.43 together with a small change to define the
Fourier series expansion of a periodic signal $x(t)$ in the interval $[T]$, where $T = 2\pi/\omega$.
We will distribute the normalizing factor $1/T$ by multiplying both equations by

$$\kappa_T = \frac{1}{\sqrt{T}} \tag{5.44}$$

Some authors choose not to distribute the normalizing factor, but to leave it on
only one of the equations. I prefer it this way, because it emphasizes the symmetrical
nature of the equations, and it simplifies the intuitive interpretation of an
equal-power law we will discuss later. Making the normalization symmetrical will also
make many of our later equations more symmetrical, and avoid messy scaling factors
on one side of the transform or another.

Note that distributing this normalization factor does not change $x(t)$. It just
means that the $a_k$ are scaled by a factor $1/\kappa_T$, and we compensate for that change
by multiplying by $\kappa_T$ when we recompute $x(t)$ from the $a_k$.

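The defining relations for the Fourier series expansion are

$$x(t) = \kappa_T \sum_k a_k e^{jk\omega t} = \kappa_T \langle \bar{a}_k \,|\, \psi_k \rangle \tag{5.45}$$

$$a_k = \kappa_T \int_{[T]} x(t)\,e^{-jk\omega t}\,dt = \kappa_T \langle \psi_k | x \rangle_{[T]} \tag{5.46}$$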

Equation 5.45 is called the synthesis equation. Equation 5.46 is called the analysis
equation. The coefficients $a_k$ are called the Fourier series coefficients or spectral
coefficients for $x(t)$.

These names come about because they tell us how to analyze a signal $x(t)$ and
describe it in terms of scaled complex exponentials. The analysis equation represents
the transformation of $x(t)$ from signal space into frequency space, where we can
now speak of its various frequency components. To get $x(t)$ back again from this
frequency space representation, we synthesize it from the coefficients $a_k$.

This is important news, because the frequency-space representation of a signal 
tells us all sorts of useful information about the signal. For example, suppose that 
there is no high-frequency information (that is, all the exponentials corresponding
to $\omega > \omega_F$ for some $\omega_F$ are zero; that means $a_k = 0$ for $k > b$ for some integer $b$).
Then we can compress the representation by simply throwing away the high-order
coefficients; they’re all zero, so there’s no need to store them. This would also tell
us that the signal is very smooth and slowly changing. If we want to represent
the signal with samples, say pixels in a frame buffer, then there is some frequency
$\omega_F$ that corresponds to the most quickly changing signals that we can represent for
some particular spacing of pixels. This is very important, because it also means
that if there are frequencies in the signal beyond that point, we will not be able to
represent them directly. In fact, if we aren’t careful these high frequencies will show
up in our picture anyway, but in a distorted form we call aliases. The techniques
of anti-aliasing for battling this phenomenon will occupy much of our attention in
this and later chapters. The best way to understand aliasing is by looking at the
frequency-space representation of a signal and considering how much energy it has
at different frequencies. The analysis equation is our first tool for discovering that
information, associating a large value of $a_k$ with a large amount of energy at the
frequency corresponding to $k\omega$.

Let’s now look a bit more closely at Equation 5.46. We can only apply the series 
expansion to those signals for which this integral converges. 


5.5.1 Convergence 

So far we have no guarantee that the Fourier series for a given signal exists. We 
need a test that we can apply to a function that will guarantee that the expansions 
converge. Such a test is made up of the three conditions of the Dirichlet criteria.

The Dirichlet conditions require that a function be absolutely integrable. A
function $f(t)$ is absolutely integrable if

$$\int_T |f(t)|\,dt < \infty \tag{5.47}$$

Two related measures we will find useful later on are energy and square integrability.
The energy $E$ in a function $f(t)$ in an interval $\mathcal{T} = [a, b]$ is defined by

$$E(f)_{\mathcal{T}} = \int_a^b |f(t)|^2\,dt = \langle f | f \rangle_{\mathcal{T}} \tag{5.48}$$

A function $f(t)$ is square integrable in an interval if $E(f)_{\mathcal{T}} < \infty$.

The Dirichlet conditions that a function f(t) must meet to have a Fourier series 
may be phrased in several ways. One useful formulation given by Oppenheim, 
Willsky, and Young [327] requires that 

■ f(t) must be absolutely integrable. 

■ Within any period of the signal, there are only a finite number of minima and 
maxima. 

■ Within any period, there are only a finite number of discontinuities, each of 
which must itself be finite. 

A finite discontinuity means that the interpolation of the function into the
discontinuity must not be infinite from either the left or the right. This is illustrated in
Figure 5.7.

Common engineering wisdom says that signals that fail these criteria are
sufficiently rare in practice that they almost never crop up [151], and this seems to hold
true in computer graphics as well. Notable exceptions are fractals [195], but they
are indeed unusual.

Although it is true that physical signals usually satisfy these criteria, many useful
abstract and idealized signals (such as the box and impulse) do not. It can be argued
that these signals represent other signals (which do satisfy the criteria) taken to some
limit; for example, the impulse may be expressed as the limit of a Gaussian bump as
the width goes to zero. These arguments can be complex and they are not particularly
illuminating, so they are not presented here. We will simply assume from here on
that all the signals in this book (unless otherwise stated) possess a Fourier series
expansion (or transform), either directly or as the result of some limit argument.
More discussion of this issue, and details of the limit arguments, may be found in
the references mentioned in the Further Reading section.

FIGURE 5.7

(a) The function $1/(x - 2)$ has an infinite discontinuity. (b) The function $|x - 2|$ has a finite
discontinuity. (c) The step function that is 0 for $x < 2$ and 1 for $x > 2$ has a finite discontinuity.

Another implication of these conditions is the quality of the fit of the Fourier 
representation of a discontinuous signal. The Fourier series will converge even at 
points of discontinuity of the original signal; at these places the new signal is the 
average of the values on the left and right sides of the discontinuity (hence the 
requirement for finite discontinuities). Since the original and Fourier-synthesized 
signals differ only at discontinuities, they have the same energy. 

The way the Fourier series represents a discontinuity in the input signal in its 
synthesized continuous signal is similar to how you would get from one trampoline 
to another slightly to the side and higher up. Standing on the trampoline, you would 
jump up and down higher and higher until at the critical moment you would shoot 
way up, and then fall on the other trampoline, slowly bouncing to a halt. Figure 5.8 
shows this approximation to a step. As more terms are added to the series, the
bounces squeeze in more closely around the discontinuity, but they don’t lose their
height. This is known as the Gibbs phenomenon.

FIGURE 5.8

(a) The signal bounces into and out of a discontinuity. (b) Including more terms in the Fourier
series. (c) Including even more terms compresses the ringing, but it doesn’t change its amplitude.

The source of the Gibbs phenomenon is that near a discontinuity, the signal
recovered by the Fourier series synthesis equation looks like a compressed sinc
function [61]. As more terms are included in the series, the sinc becomes horizontally
compressed. This pushes more of its larger lobes (and thus the energy in the signal)
toward its origin, and thus toward the discontinuity. Though the sinc may be made
arbitrarily narrow, it can’t be eliminated entirely; furthermore, its amplitude is not
decreased by this compression. Thus, there will always be some amount of ringing
near a discontinuity of a signal synthesized with the Fourier series.
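The persistence of the overshoot is easy to observe numerically. The sketch below assumes a particular test signal, the square wave $\mathrm{sign}(\sin t)$ with values $\pm 1$, whose Fourier series contains only odd sine harmonics with weights $4/(\pi n)$; the function name and the harmonic counts are illustrative choices.

```python
import numpy as np

def square_wave_partial_sum(t, max_harmonic):
    """Partial Fourier sum of the square wave sign(sin t), odd harmonics only."""
    s = np.zeros_like(t)
    for n in range(1, max_harmonic + 1, 2):
        s += (4.0 / np.pi) * np.sin(n * t) / n
    return s

# Sample densely just to the right of the jump at t = 0.
t = np.linspace(0.0, np.pi / 2.0, 200001)

# Largest value of the partial sum for increasingly many harmonics.
peaks = {m: square_wave_partial_sum(t, m).max() for m in (11, 51, 201)}
```

The square wave itself never exceeds 1, yet each entry of `peaks` stays near 1.18 however many harmonics are included: an overshoot of roughly 9 percent of the jump of height 2. Adding terms moves the peak closer to the discontinuity without lowering it.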


5.6 The Continuous-Time Fourier Transform

The Fourier series was only defined for periodic functions. But we can extend the 
definition to handle aperiodic functions, which will result in the Fourier transform. 

Our approach will be to enlarge the period of a periodic signal until one period fills 
the entire domain. This is similar to how Fourier generalized the series himself. He 
expressed his equations with respect to signals on a ring, so that they were by nature 
periodic. To find the expression for an aperiodic signal, he suggested enlarging the 
radius of the ring, essentially making the ring’s circumference, and hence the signal’s 
period, arbitrarily long. We will not phrase our discussion in these terms, but the 
essential limit argument remains the same. 

We can approximate an aperiodic signal $x(t)$ in an interval $[a, b]$ with a periodic
signal $\tilde{x}(t)$, built so that $\tilde{x}(t) = x(t)$ for $t \in [a, b]$. When $x(t)$ goes to zero, $\tilde{x}(t)$ will
repeat the active interval of $x(t)$ over and over. But we can define the active interval
to include pieces of the zeros on the left and right. In other words, if $x(t)$ is nonzero
only within an interval of support $[a, b]$, then we can choose for a single period of $\tilde{x}(t)$
an interval $[a - d, b + d]$ for any real $d > 0$. In fact, the wider we define the interval,
the better the match between the aperiodic input and the periodic representation.

Figure 5.9(a) shows an aperiodic signal $x(t)$ with an active interval $[-W/2, W/2]$.
Figure 5.9(b) shows $\tilde{x}(t)$, our periodic match to $x(t)$. The period is $[-T/2, T/2]$,
where $T > W$, so the width of the flat zero region increases as $T$ increases, as in
Figure 5.9(c).

Let’s derive the Fourier series coefficients for the periodic signal $\tilde{x}(t)$. We begin
by recalling the definition for the coefficients from Equation 5.46:

$$\begin{aligned}
a_k &= \kappa_T \langle \psi_k | \tilde{x} \rangle_{[T]} \\
&= \kappa_T \langle \psi_k | \tilde{x} \rangle_{[-T/2,\,T/2]} \\
&= \kappa_T \langle \psi_k | x \rangle_{[-T/2,\,T/2]} \\
&= \kappa_T \langle \psi_k | x \rangle \\
&= \kappa_T \int_{-\infty}^{\infty} x(t)\,e^{-jk\omega t}\,dt
\end{aligned} \tag{5.49}$$

where we chose $[-T/2, T/2]$ for the period of $\tilde{x}(t)$, and then observed that $\tilde{x}(t) = x(t)$
within that interval. The last step expresses our knowledge that $x(t) = 0$ outside
the interval, so we can send the interval to $[-\infty, +\infty]$. Choosing $T$ for the period
implies the sampling frequency $\omega = 2\pi/T$.

FIGURE 5.9

(a) An aperiodic signal $x(t)$, with an active interval from $-W$ to $W$. (b) The periodic signal $\tilde{x}(t)$
matches $x(t)$ within the interval $-T/2$ to $T/2$, with $T = 3W/4$. (c) The periodic signal with
$T = 2W$.

We can write each scaled $a_k$ as a sample from a continuous function $X_c(\omega)$, which
we define as

$$X_c(\omega) = \int x(t)\,e^{-j\omega t}\,dt = \langle \psi | x \rangle \tag{5.50}$$

so that $a_k$ is

$$a_k = \kappa_T \int x(t)\,e^{-j\omega t(k)}\,dt = \kappa_T X_c(\omega k) \tag{5.51}$$

So the Fourier series coefficients are simply equally spaced samples of the
continuous function $X_c(\omega)$, which only depend on the defining width $T$ that encloses
the active interval. Now that we’ve managed to define coefficients for our aperiodic
signal, let’s see what signal these coefficients synthesize; we’ll call it $\tilde{x}(t)$.

Recalling the synthesis equation for the Fourier series from Equation 5.45,
plugging in the coefficients from Equation 5.51, and substituting $T = 2\pi/\omega$:

$$\begin{aligned}
\tilde{x}(t) &= \kappa_T \langle \bar{a} \,|\, \psi \rangle \\
&= \kappa_T \sum_k a_k e^{jk\omega t} \\
&= \kappa_T \sum_k \kappa_T X_c(\omega k)\,e^{jk\omega t} \\
&= \frac{1}{2\pi} \sum_k X_c(\omega k)\,e^{jk\omega t}\,\omega
\end{aligned} \tag{5.52}$$

Look at the last line of Equation 5.52. We can consider this as an approximate
integral, built from the sum of many small rectangles, each of height $X_c(\omega k)\,e^{jk\omega t}$
and width $\omega$. Figure 5.10 shows this interpretation.

FIGURE 5.10

The Fourier series coefficients for an aperiodic signal may be represented as equally spaced samples
of $X_c(\omega)$.

Recall that we want to push $T$ to infinity, which will bring our approximate
periodic signal $\tilde{x}(t)$ into closer agreement with the input aperiodic signal $x(t)$. As
$T \to \infty$, $\omega \to 0$, and $\tilde{x}(t) \to x(t)$, so this summation approaches an integration.
Thus we have recovered our original input signal: from aperiodic to aperiodic by
way of periodicity! Passing the summation to an integral, we find

$$x(t) = \frac{1}{2\pi} \int X_c(\omega)\,e^{j\omega t}\,d\omega \tag{5.53}$$
Together, Equations 5.50 and 5.53 make up the continuous-time Fourier
transform (CTFT), our goal for this section. As with the series, we will normalize the two
equations with a symmetric scaling factor $\kappa$, defined by

$$\kappa = 1/\sqrt{2\pi} \tag{5.54}$$

Here then is the definition of the Fourier transform:

$$X(\omega) = \frac{1}{\sqrt{2\pi}} \int x(t)\,e^{-j\omega t}\,dt = \kappa \langle \psi | x \rangle \tag{5.55}$$

$$x(t) = \frac{1}{\sqrt{2\pi}} \int X(\omega)\,e^{j\omega t}\,d\omega = \kappa \langle \bar{X} | \psi \rangle \tag{5.56}$$

Equation 5.55 is the analysis equation; Equation 5.56 is the synthesis equation. 
Sometimes the analysis equation is called the forward Fourier transform and the 
synthesis equation is called the inverse Fourier transform. Often the adjective
“forward” is dropped, and we refer to taking “the Fourier transform” of some signal.
The synthesis step is always qualified with the adjectives “reverse” or “inverse.” 
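The symmetric normalization can be checked on a signal whose transform is known in closed form. The sketch below assumes the test signal $x(t) = e^{-t^2/2}$, which under Equation 5.55 transforms to $X(\omega) = e^{-\omega^2/2}$, and approximates the integral by a Riemann sum over a wide, finely sampled interval; the interval, step, and names are assumptions for illustration.

```python
import numpy as np

kappa = 1.0 / np.sqrt(2.0 * np.pi)

# Wide, fine sampling so the truncated sum stands in for the infinite integral.
t = np.linspace(-20.0, 20.0, 40001)
dt = t[1] - t[0]
x = np.exp(-t**2 / 2.0)

def forward_transform(omega):
    """Analysis equation: kappa * integral of x(t) e^{-j omega t} dt."""
    return kappa * np.sum(x * np.exp(-1j * omega * t)) * dt

omegas = np.array([0.0, 0.5, 1.0, 2.0])
X = np.array([forward_transform(om) for om in omegas])

# Closed-form transform of the Gaussian under the symmetric normalization.
expected = np.exp(-omegas**2 / 2.0)
```

With the factor $1/\sqrt{2\pi}$ on both equations, this Gaussian is its own transform, which is one way to see the symmetry the normalization was chosen to provide.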

We call a signal $x(t)$ and its Fourier transform $X(\omega)$ a Fourier pair. When a
lowercase roman letter is used for a signal, it is common to represent its Fourier
transform with the corresponding capital. We can write the Fourier transform as an
operator $\mathcal{F}$, so that $X(\omega) = \mathcal{F}\{x(t)\}$. We also use a two-headed arrow with a small
$\mathcal{F}$ centered above to write Fourier pairs. Here is a summary of this notation:

$$\begin{aligned}
X(\omega) &= \mathcal{F}\{x(t)\} \\
x(t) &= \mathcal{F}^{-1}\{X(\omega)\} \\
x(t) &\overset{\mathcal{F}}{\longleftrightarrow} X(\omega)
\end{aligned} \tag{5.57}$$

The function $X(\omega)$ is called the spectrum of $x(t)$. It gives the magnitude and
phase of each complex exponential of frequency $\omega$ in $x(t)$. If $x(t)$ has any sharp
corners, then high frequencies will be required to approximate those corners; smoother
functions tend to have less high-frequency information.

FIGURE 5.11

A plot of $X(\omega) = (1/\kappa)\,\delta(\omega - \omega_0)$.


5.6.1 Fourier Transform of Periodic Signals

Our derivation of the Fourier transform was based on an aperiodic input. The 
Fourier transform may also be used to analyze periodic signals; we will see that the 
transform in this case is closely related to the Fourier series. 

Consider a periodic input signal $x(t)$ whose Fourier transform is a single impulse,
located at $\omega_0$ and with height $1/\kappa$:

$$X(\omega) = \frac{1}{\kappa}\,\delta(\omega - \omega_0) \tag{5.58}$$


This spectrum is shown in Figure 5.11. 

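We can find the signal $x(t)$ from the inverse Fourier transform of $X(\omega)$:

$$\begin{aligned}
x(t) &= \kappa \int X(\omega)\,e^{j\omega t}\,d\omega \\
&= \kappa \int \frac{1}{\kappa}\,\delta(\omega - \omega_0)\,e^{j\omega t}\,d\omega \\
&= e^{j\omega_0 t}
\end{aligned} \tag{5.59}$$

So this spectrum represents the Fourier transform of a continuous-time, periodic
function, namely a complex exponential with frequency $\omega_0$.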

We can generalize this result. Suppose that we have some periodic signal $x(t)$
with Fourier series coefficients $a_k$. We assert that the Fourier transform of this signal
is a set of equally spaced impulses with spacing $\omega_0$, each with height $\kappa_T a_k / \kappa$. To see
that this is so, we write this spectrum $X(\omega)$, illustrated in Figure 5.12,

$$X(\omega) = \sum_k \frac{\kappa_T a_k}{\kappa}\,\delta(\omega - k\omega_0) \tag{5.60}$$

FIGURE 5.12

The scaled impulses of $X(\omega)$.

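and take its inverse transform:

$$\begin{aligned}
x(t) &= \kappa \langle \bar{X} | \psi \rangle \\
&= \kappa \int \sum_k \frac{\kappa_T a_k}{\kappa}\,\delta(\omega - k\omega_0)\,e^{j\omega t}\,d\omega \\
&= \kappa_T \sum_k a_k \int \delta(\omega - k\omega_0)\,e^{j\omega t}\,d\omega \\
&= \kappa_T \sum_k a_k e^{jk\omega_0 t} \\
&= \kappa_T \langle \bar{a} | \psi \rangle
\end{aligned} \tag{5.61}$$

The last line of Equation 5.61 is the Fourier series synthesis equation for $x(t)$, which
confirms our assertion.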

So the Fourier transform of a periodic signal with frequency $\omega_0$ (or period $T_0 =
2\pi/\omega_0$) is a set of impulses, spaced $\omega_0$ apart, where the impulse at
$\omega = k\omega_0$ has height $\kappa_T a_k / \kappa$.

5.6.2 Parseval’s Theorem

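We mentioned earlier that we distributed the normalizing factors $\kappa_T$ and $\kappa$ to simplify
a power relationship. That relationship states that a signal and its transform contain
equal energy:

$$\int \underbrace{|F(\omega)|^2}_{\text{energy density at } \omega}\,d\omega = \int \underbrace{|f(t)|^2}_{\text{energy density at } t}\,dt \tag{5.62}$$

In braket notation, we may state:

$$\begin{aligned}
\langle f | f \rangle &= \langle F | F \rangle \\
E(f) &= E(F)
\end{aligned} \tag{5.63}$$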

This gives us some confidence that we have chosen a reasonable measure for energy. 
Since the Fourier transform is simply a change of basis, we would not expect the 
energy of the signal it represents to change, and this relationship says that indeed 
it does not. This property may be generalized for two functions $f$ and $g$ to a form
known as Parseval’s theorem:

$$\langle f | g \rangle = \langle F | G \rangle \tag{5.64}$$

Parseval’s theorem tells us that the energy in a signal and its Fourier transform
are the same. Thus if $x(t)$ gets narrower, $X(\omega)$ will become taller, wider, or both in
order to maintain the energy relationship. The same situation holds in the opposite
direction.
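A discrete stand-in makes the energy relationship easy to test. The sketch below is an assumption-laden illustration: it uses the discrete Fourier transform from NumPy rather than the continuous transform defined above, with `norm="ortho"` so that the scaling is split evenly between the forward and inverse transforms, and it uses random test signals.

```python
import numpy as np

rng = np.random.default_rng(0)
f = rng.standard_normal(256)
g = rng.standard_normal(256)

# norm="ortho" scales both the forward and inverse DFT by 1/sqrt(N),
# the discrete counterpart of a symmetric normalizing factor.
F = np.fft.fft(f, norm="ortho")
G = np.fft.fft(g, norm="ortho")

# np.vdot conjugates its first argument, like the braket.
energy_signal = np.vdot(f, f).real      # <f|f>
energy_spectrum = np.vdot(F, F).real    # <F|F>
inner_signal = np.vdot(f, g)            # <f|g>
inner_spectrum = np.vdot(F, G)          # <F|G>
```

With an unsymmetric normalization the two energies would differ by a constant factor, which is the bookkeeping the symmetric choice avoids.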


5.7 Examples 

We now look at some examples of Fourier series and Fourier transforms. 


5.7.1 The Box Signal

We begin by finding the Fourier series for the periodic box function: the
box function of width $W$ repeating at an interval $T$, shown in Figure 5.13.

Recall that $\omega = 2\pi/T$. Within one period the box is nonzero only on $[-W/2, W/2]$, so
that is our interval of integration. We start with the
definition from Equation 5.46:

$$a_k = \kappa_T \langle \psi_k | x \rangle_{[T]} = \kappa_T \int x(t)\,e^{-jk\omega t}\,dt \tag{5.65}$$

Observe that for $k = 0$, $\psi_k = 1$, so we can immediately write $a_0$:

$$a_0 = \kappa_T \int_{-W/2}^{W/2} dt = \kappa_T W \tag{5.66}$$

For the other values of $k$, we can expand the fraction and simplify:

$$\begin{aligned}
a_k &= \kappa_T \langle \psi_k | x \rangle_{[T]} \\
&= \kappa_T \int_{-W/2}^{W/2} e^{-jk\omega t}\,dt \\
&= \kappa_T \left[ \frac{e^{-jk\omega t}}{-jk\omega} \right]_{-W/2}^{W/2} \\
&= \kappa_T W \,\mathrm{sinc}\!\left( \frac{Wk}{2\pi}\,\omega \right)
\end{aligned} \tag{5.67}$$

by property E21 (Table 4.1). Equations 5.66 and 5.67 specify the coefficients for
$k = 0$ and $k \neq 0$, respectively. The values of $a_k$ are plotted in Figure 5.14.

FIGURE 5.13

The box of width W and interval T.

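We can find the transform of the aperiodic function $b_W(t)$ with the Fourier
transform. We begin with the analysis equation from Equation 5.55:

$$\begin{aligned}
X(\omega) &= \kappa \langle \psi | x \rangle \\
&= \kappa \int_{-W/2}^{W/2} x(t)\,e^{-j\omega t}\,dt \\
&= \kappa \left[ \frac{e^{-j\omega t}}{-j\omega} \right]_{-W/2}^{W/2} \\
&= \kappa W \,\mathrm{sinc}\!\left( \frac{W}{2\pi}\,\omega \right)
\end{aligned} \tag{5.68}$$

again using property E21. The spectrum of $b_W(t)$ is plotted in Figure 5.15.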
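The closed form for the box spectrum can be compared against a direct numerical evaluation of the analysis integral. In this sketch the box width `W`, the sample count, and the test frequencies are arbitrary assumptions, and the sinc is taken to be the normalized one, $\mathrm{sinc}(u) = \sin(\pi u)/(\pi u)$, which is the convention NumPy's `np.sinc` uses and the one for which the closed form matches the integral.

```python
import numpy as np

kappa = 1.0 / np.sqrt(2.0 * np.pi)
W = 3.0                                    # box width (arbitrary choice)
N = 200000
dt = W / N
t = -W / 2.0 + (np.arange(N) + 0.5) * dt   # midpoints across the box support

def box_transform(omega):
    """Analysis equation with x(t) = 1 on [-W/2, W/2], by the midpoint rule."""
    return kappa * np.sum(np.exp(-1j * omega * t)) * dt

omegas = np.array([0.0, 0.7, 2.0, 5.0])
numeric = np.array([box_transform(om) for om in omegas])

# Closed form: kappa * W * sinc(W * omega / (2 pi)), normalized sinc.
closed_form = kappa * W * np.sinc(W * omegas / (2.0 * np.pi))
```

The imaginary part of `numeric` vanishes because the box is symmetric about the origin, so only the cosine part of the exponential survives the integration.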

We can observe a few similarities and differences between Figures 5.14 and 5.15.
Notice that near the origin, they both look like sinc functions. The Fourier series
representation of the periodic signal then turns around and starts to repeat, and it
remains periodic. The Fourier transform of the aperiodic signal is a sinc that rings
forever with decreasing amplitude. So the series expansion of the periodic box is
periodic, and the Fourier transform of the aperiodic box is aperiodic, though they
have similar shapes near the origin.

To come full circle, let’s now find the inverse transform of the spectrum in
Equation 5.68. We expect something like the box signal $b_W(t)$ that we started with, but
with the average value of the box at its discontinuities. We begin with the definition
of the synthesis equation, Equation 5.56, and then substitute for $X(\omega)$:


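$$\begin{aligned}
x(t) &= \kappa \langle \bar{X} | \psi \rangle \\
&= \kappa \int \kappa W \,\mathrm{sinc}\!\left( \frac{W}{2\pi}\,\omega \right) e^{j\omega t}\,d\omega \\
&= \kappa^2 \int \frac{2}{\omega} \sin(\omega W/2)\,e^{j\omega t}\,d\omega \\
&= 2\kappa^2 \int \frac{\sin(\omega W/2)}{\omega} \left[ \cos(\omega t) + j \sin(\omega t) \right] d\omega \\
&= \frac{1}{\pi} \int \left[ \cos(\omega t)\,\frac{\sin(\omega W/2)}{\omega} + j \sin(\omega t)\,\frac{\sin(\omega W/2)}{\omega} \right] d\omega
\end{aligned} \tag{5.69}$$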

7 T J UO UO 

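Since $\sin(a)\sin(b) = \sin(-a)\sin(-b)$, the product $\sin(\omega t)\sin(\omega W/2)$ is even in
$\omega$; dividing it by $\omega$ makes the entire right-hand (imaginary) term of this last
equation odd. Note that for any odd function $f$, $\int f(\omega)\, d\omega = 0$ over an interval
symmetric about the origin, so this entire imaginary term goes to zero:

$$x(t) = \frac{1}{\pi} \int \cos(\omega t)\, \frac{\sin(\omega W/2)}{\omega}\, d\omega \tag{5.70}$$

Thus we are left with a product of two trig functions. Substituting $a = t$ and $b = W/2$
lets us write this in a form that can be found in a table of standard integrals:

$$x(t) = \frac{1}{\pi} \int \frac{\cos(a\omega)\sin(b\omega)}{\omega}\, d\omega = \begin{cases} 1 & |t| < W/2 \\ 1/2 & t = \pm W/2 \\ 0 & |t| > W/2 \end{cases} \tag{5.71}$$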


So we have come full circle, from a box to its transform and back again to
an almost-box. Equation 5.71 is equivalent to our box function except at the
discontinuities $t = \pm W/2$, where it has the average value, as expected.
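The behavior at the discontinuities can also be checked numerically. The following sketch truncates the integral of Equation 5.70 to a finite range, which is only an approximation to the infinite integral, and evaluates it inside the box, on its edge, and outside it. The function name and the `limit` and `steps` parameters are hypothetical choices made for this illustration.

```python
import math

def inverse_box_transform(t, W, limit=500.0, steps=100000):
    """Midpoint-rule estimate of (1/pi) * integral_{-limit}^{limit} cos(w t) sin(w W/2) / w dw.

    Truncating the range leaves a small ripple of order 1/limit in the result.
    """
    dw = 2.0 * limit / steps
    total = 0.0
    for i in range(steps):
        w = -limit + (i + 0.5) * dw   # with an even step count no midpoint lands on w = 0
        total += math.cos(w * t) * math.sin(w * W / 2.0) / w * dw
    return total / math.pi
```

For W = 1 the estimate should come out near 1 at t = 0, near 1/2 at t = W/2, and near 0 at t = 2, up to the truncation ripple.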


5.7.2 The Box Spectrum

Suppose that we now reinterpret our box to be a spectrum rather than a time
signal. By taking the inverse transform of this spectrum, we can find what signal
corresponds to a frequency-space box. We will redefine our box signal $b(t)$ to be a
box spectrum $b_W(\omega)$:

$$b_W(\omega) = \begin{cases} 1 & |\omega| < W/2 \\ 0 & |\omega| > W/2 \end{cases} \tag{5.72}$$

This is illustrated in Figure 5.16.

FIGURE 5.16

The box spectrum.

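To find the signal that corresponds to this aperiodic spectrum, we start with the
synthesis equation:

$$\begin{aligned}
x(t) &= \kappa \langle X | \psi \rangle \\
&= \kappa \int X(\omega)\, e^{j\omega t}\, d\omega \\
&= \kappa \int_{-W/2}^{W/2} e^{j\omega t}\, d\omega \\
&= \kappa W \operatorname{sinc}\!\left(\frac{W}{2\pi}\, t\right)
\end{aligned} \tag{5.73}$$

by E21. Equation 5.73 is plotted in Figure 5.17.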

Notice that for all the examples we have considered above, there is an inverse 
relationship between the width of the box and the frequency of the damped sinusoid. 
As the box gets narrower, the sinusoid spreads out (that is, it takes longer to reach 
its first zero crossing). As the box gets wider, the sinusoid contracts. 

Intuitively, we can think of the limit of this process as a box of no width, and 
a sinusoid whose first zero is at infinity and is therefore flat. We will confirm this 
intuition when we look at impulse signals. 
FIGURE 5.17


To get a feeling for this trade-off, think of the box spectrum as telling us that to 
create a new time signal, we add up complex exponentials with frequency from 0 to 
some upper limit. With a low cutoff frequency, the synthesized signal will be gentle 
and smooth. As we add higher and higher frequencies, we can begin to include 
sharper corners in our signal. So the wider the box is in frequency space, the higher 
the frequency of the exponentials that we sum, and the more angular the synthesized 
function can be. 
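The inverse relationship between the width of the box and the spread of the damped sinusoid can be made concrete with a small sketch. It assumes the signal of Equation 5.73 in the form $\kappa W \operatorname{sinc}(Wt/2\pi)$ with the normalized sinc, so the first zero crossing should fall at $t = 2\pi/W$; the helper names and the scan step `dt` are illustrative.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def box_spectrum_signal(t, W):
    """x(t) = kappa * W * sinc(W t / (2 pi)), the inverse transform of a box spectrum of width W."""
    kappa = 1.0 / math.sqrt(2.0 * math.pi)
    return kappa * W * sinc(W * t / (2.0 * math.pi))

def first_zero_crossing(W, dt=1e-3):
    """Scan forward from t = 0 until the signal first stops being positive."""
    t = dt
    while box_spectrum_signal(t, W) > 0.0:
        t += dt
    return t
```

Doubling W halves the returned crossing time, which is the trade-off described above: a wider box in frequency space gives a narrower, more sharply varying signal.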


5.7.3 The Gaussian

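The unnormalized Gaussian function is a smooth bump, which may be given by

$$g(t) = e^{-t^2/\sigma^2} \tag{5.74}$$

The area under this Gaussian is

$$\int e^{-t^2/\sigma^2}\, dt = \sigma\sqrt{\pi} \tag{5.75}$$

The parameter $\sigma$ sets the width of the bump. In the conventional normalized form
$\frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-t^2/2\sigma^2}$, whose leading factor equals $\kappa/\sigma$, the factor $\sigma^2$ is the
variance, and its square root $\sigma$ is the standard deviation.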

The Gaussian is particularly attractive for several reasons. Consider the Fourier
transform of a Gaussian $f(t) = e^{-\pi t^2}$, where we set $1/\sigma^2 = \pi$.

$$\begin{aligned}
F(\omega) &= \kappa \langle \psi | f \rangle \\
&= \kappa \int e^{-\pi t^2 - j\omega t}\, dt \\
&= \kappa \int e^{a}\, e^{-\pi [t + (j\omega/2\pi)]^2}\, dt
\end{aligned} \tag{5.76}$$

FIGURE 5.18

(a) A Gaussian bump. (b) Its Fourier transform.


We can complete the square and solve for $a = -\omega^2/4\pi$ in the last equation. Since
this is independent of $t$, we can pull it out front. Then by substituting $u = t + (j\omega/2\pi)$,
and $du = dt$, we can write

$$\begin{aligned}
F(\omega) &= \kappa e^{-\omega^2/4\pi} \int e^{-\pi [t + (j\omega/2\pi)]^2}\, dt \\
&= \kappa e^{-\omega^2/4\pi} \int e^{-\pi u^2}\, du \\
&= \kappa e^{-\omega^2/4\pi}
\end{aligned} \tag{5.77}$$


where we have used property E20. So the Fourier transform of a Gaussian bump is 
another Gaussian bump, as shown in Figure 5.18. 

The Gaussian is closely related to the complex exponentials, which as we have 
seen are eigenfunctions of LTI systems. It is because of this connection that Gaussians 
pass through the Fourier transform unchanged in form: the Fourier transform of a 
Gaussian is another Gaussian. 
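A direct numerical evaluation gives a quick check that the transform of $e^{-\pi t^2}$ is $\kappa e^{-\omega^2/4\pi}$. This is a minimal sketch, assuming the symmetric convention $\kappa = 1/\sqrt{2\pi}$ and a finite integration range wide enough that the tails of the bump are negligible; the helper names are illustrative.

```python
import cmath
import math

KAPPA = 1.0 / math.sqrt(2.0 * math.pi)

def fourier_transform(f, omega, t_min=-10.0, t_max=10.0, steps=4000):
    """Midpoint-rule estimate of kappa * integral of f(t) exp(-j omega t) dt over [t_min, t_max]."""
    dt = (t_max - t_min) / steps
    total = 0j
    for i in range(steps):
        t = t_min + (i + 0.5) * dt
        total += f(t) * cmath.exp(-1j * omega * t) * dt
    return KAPPA * total

def gaussian(t):
    """The Gaussian bump exp(-pi t^2)."""
    return math.exp(-math.pi * t * t)

def gaussian_spectrum(omega):
    """The closed-form transform kappa * exp(-omega^2 / (4 pi))."""
    return KAPPA * math.exp(-omega * omega / (4.0 * math.pi))
```

Evaluating `fourier_transform(gaussian, omega)` at a few frequencies and comparing against `gaussian_spectrum(omega)` should show agreement to many digits, with a negligible imaginary part.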
FIGURE 5.19

$\mathcal{F}\{\delta(t)\} = \kappa$.


5.7.4 The Impulse Signal

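Taking the Fourier transform of the impulse $x(t) = \delta(t)$ is easy:

$$\begin{aligned}
X(\omega) &= \kappa \langle \psi | \delta \rangle \\
&= \kappa \int \delta(t)\, e^{-j\omega t}\, dt \\
&= \kappa e^{0} \\
&= \kappa
\end{aligned} \tag{5.78}$$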


Thus the Fourier transform of an impulse is a flat spectrum, with equal energy at
every frequency, as shown in Figure 5.19. The opposite interpretation is also true.
Starting with an impulse $X(\omega) = \delta(\omega)$ in the frequency domain, we can take its
inverse Fourier transform:

$$\begin{aligned}
x(t) &= \kappa \langle \delta(\omega) | \psi \rangle \\
&= \kappa \psi(0) = \kappa e^{0} \\
&= \kappa
\end{aligned} \tag{5.79}$$


So the inverse transform of an impulse is a flat signal with equal height over all time, 
as shown in Figure 5.20. 

We sometimes say that such a flat signal is a DC signal, by analogy to the
voltage-time plot of direct current transmission of electrical power. The amplitude
of the Fourier transform of a signal at $\omega = 0$ is sometimes called that signal’s DC
component. Thus a flat signal is pure DC, since its transform only has energy at
$\omega = 0$. An impulse in time corresponds to a spectrum with equal energy at all
frequencies.
FIGURE 5.21

The impulse train.


5.7.5 The Impulse Train

Recall the shah function, or impulse train, from Equation 4.39 where the impulses
appear at equal intervals of width $T$:

$$\mathrm{III}_T(t) = \sum_k \delta(t - kT) \tag{5.80}$$

This function is plotted in Figure 5.21.

Let’s find the Fourier transform for this signal. We begin by recalling from
Equations 5.60 and 5.61 that for any periodic signal $x(t)$ with frequency $\omega_0$ (recall
$\omega_0 = 2\pi/T$), and Fourier series coefficients $a_k$, the strength of the signal at each
harmonic $k\omega_0$ of the signal’s frequency is $\kappa_T a_k/\kappa$.

We can apply this observation directly to finding the transform of the periodic
shah signal. So we begin by finding the Fourier series coefficients over a period
centered at the origin:

$$\begin{aligned}
a_k &= \kappa_T \langle \psi_k | x \rangle_{[T]} \\
&= \kappa_T \langle \psi_k | \delta \rangle_{[-T_0/2,\, T_0/2]} \\
&= \kappa_T \psi_k(0) \\
&= \kappa_T
\end{aligned} \tag{5.81}$$


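So all the coefficients are the same. We now plug these into the Fourier transform
for periodic signals from Equation 5.60:

$$\begin{aligned}
X(\omega) &= \sum_k \frac{\kappa_T a_k}{\kappa}\, \delta(\omega - k\omega_0) \\
&= \sum_k \frac{\kappa_T^2}{\kappa}\, \delta(\omega - k\omega_0)
\end{aligned} \tag{5.82}$$

Equation 5.82 is plotted in Figure 5.22. The essential point to notice is that
a sequence of equally spaced impulses is turned into another sequence of equally
spaced impulses, though their heights and intervals are different in the two spaces.
In symbols,

$$\mathrm{III}_{T_0}(t) \stackrel{\mathcal{F}}{\longleftrightarrow} \frac{\kappa_T^2}{\kappa}\, \mathrm{III}_{2\pi/T_0}(\omega) \tag{5.83}$$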






Notice that as the interval $T_0$ in signal space decreases and the pulses start to
arrive closer together, the interval $2\pi/T_0$ in frequency space gets larger and there is
more room between impulses in the spectrum. We will see in the next chapter that
this inverse relationship between impulse intervals is the reason we can sometimes
reduce aliasing by sampling more often.


5.8 Duality 

The Fourier transform possesses a very useful property known as duality. The 
intuitive interpretation is that if you have a picture of two functions that you know 
are a Fourier transform pair, then it doesn’t matter which one you label “signal” 
and which one “spectrum”; both labeling schemes are correct! Note that we are 
speaking here only of the Fourier transform and not the series representation; the 
latter does not share this property. This form of duality is only strictly true for the 
symmetrical definition used in this book; if one puts the normalizing factors on just 
the analysis or synthesis equations, a squared normalizing factor or its inverse will 
have to be inserted on one side or the other of the duality relation. 

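We have intimated duality with the box and impulse transform examples in the
previous section. Consider the following Fourier transform pairs:

$$\begin{aligned}
\delta(t) &\stackrel{\mathcal{F}}{\longleftrightarrow} \kappa \\
\kappa &\stackrel{\mathcal{F}}{\longleftrightarrow} \delta(\omega)
\end{aligned} \tag{5.84}$$

and

$$\begin{aligned}
b_W(t) &\stackrel{\mathcal{F}}{\longleftrightarrow} \kappa W \operatorname{sinc}\!\left(\omega\, \frac{W}{2\pi}\right) \\
\kappa W \operatorname{sinc}\!\left(t\, \frac{W}{2\pi}\right) &\stackrel{\mathcal{F}}{\longleftrightarrow} b_W(\omega)
\end{aligned} \tag{5.85}$$

These rather remarkable pairs of transformations are not unique, but are
representative of the general principle of duality.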

To be a bit more general, suppose you have a function $f(t)$ with an associated
Fourier transform $g(\omega) = \mathcal{F}\{f(t)\}$. Now think of $g(\omega)$ as a time signal; that is,
simply replace $\omega$ with $t$ in the definition of $g$, resulting in $g(t)$. What is the transform
of this signal, $h(\omega) = \mathcal{F}\{g(t)\}$? The principle of duality tells us that $h(\omega) = f(-\omega)$.

We will first show this principle using integral notation. Write two functions $f$
and $g$ in terms of two variables $u$ and $v$:

$$f(u) = \kappa \int g(v)\, e^{-juv}\, dv \tag{5.86}$$

Then with $u = t$ and $v = -\omega$,

$$f(t) = \kappa \int g(-\omega)\, e^{jt\omega}\, d\omega \tag{5.87}$$

which is the synthesis equation for the signal $f(t)$ with Fourier transform $g(-\omega)$.
Now substitute $u = \omega$ and $v = t$ to get

$$f(\omega) = \kappa \int g(t)\, e^{-j\omega t}\, dt \tag{5.88}$$

This is the analysis equation for signal $g(t)$ with spectrum $f(\omega)$.

Thus we have the two related Fourier pairs

$$\begin{aligned}
g(t) &\stackrel{\mathcal{F}}{\longleftrightarrow} f(\omega) \\
f(t) &\stackrel{\mathcal{F}}{\longleftrightarrow} g(-\omega)
\end{aligned} \tag{5.89}$$

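We can write the same thing in braket notation. Here we will include the domain
of the functions explicitly, since the very nature of duality makes it unclear from
context:

$$\begin{aligned}
g(\omega) &= \kappa \langle \psi(t) | f(t) \rangle \\
g(t) &= \kappa \langle f(-\omega) | \psi(\omega) \rangle
\end{aligned} \tag{5.90}$$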

Duality is important primarily for its conceptual power, relating the two domains 
in such a symmetrical way. Duality also immediately doubles our repertoire of 
Fourier transform pairs, since anything calculated in one domain can be immediately 
applied to the other. 

Figure 5.23(a) shows a signal $f(t)$ and its transform $F(\omega)$ from a table. We can
immediately write a new transform pair $\mathcal{F}^{-1}\{f(\omega)\} = F(-t)$, as in Figure 5.23(b).
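Duality can be checked numerically on a pair whose transform is known in closed form. The sketch below is an illustration under stated assumptions: it uses the symmetric convention $\kappa = 1/\sqrt{2\pi}$, takes as its test signal a Gaussian bump shifted to $t = 1$ (chosen here so that the reversal in the duality relation is visible), and obtains its transform from the Gaussian result of Equation 5.77 combined with the standard delay property. The names `transform`, `f`, and `F` are hypothetical.

```python
import cmath
import math

KAPPA = 1.0 / math.sqrt(2.0 * math.pi)

def transform(func, omega, lo=-40.0, hi=40.0, steps=16000):
    """Midpoint-rule estimate of kappa * integral of func(t) exp(-j omega t) dt over [lo, hi]."""
    d = (hi - lo) / steps
    total = 0j
    for i in range(steps):
        t = lo + (i + 0.5) * d
        total += func(t) * cmath.exp(-1j * omega * t) * d
    return KAPPA * total

def f(t):
    """An asymmetric test signal: a Gaussian bump centered at t = 1."""
    return math.exp(-math.pi * (t - 1.0) ** 2)

def F(omega):
    """Its transform: the Gaussian spectrum times the delay factor exp(-j omega)."""
    return KAPPA * math.exp(-omega * omega / (4.0 * math.pi)) * cmath.exp(-1j * omega)
```

Treating `F` as a time signal and transforming it again, `transform(F, omega)` should match `f(-omega)`: the original signal reappears, reversed, in the frequency domain.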


5.9 Filtering and Convolution 

Recall the definition of a system’s frequency response in Equation 4.54:

$$H(\omega) = \int h(\tau)\, e^{-j\omega\tau}\, d\tau \tag{5.91}$$

This definition is of the same form as the Fourier transform analysis equation. This
relates the two methods we have seen for characterizing a system, using either the
impulse response or the frequency response. Thus we have the very useful fact that
the system response $H(\omega)$ and the impulse response $h(t)$ form a Fourier pair:

$$h(t) \stackrel{\mathcal{F}}{\longleftrightarrow} H(\omega) \tag{5.92}$$
FIGURE 5.23

(a) A signal f(t) and its transform. (b) The signal of (a) interpreted as a spectrum and its inverse
transform, given by the duality property.

FIGURE 5.24

A linear filter modulates a signal by attenuating or magnifying each exponential component
individually.


The frequency response of an LTI system is a powerful way to characterize a
system completely. This is because we can break down any input into a sum of complex
exponentials, compute the response of the system by simply applying the complex
scaling factor to each exponential, then again summing together the weighted
exponentials. This process is diagrammed in Figure 5.24.

This is the process of filtering. As an example, filters are often provided explicitly
in audio electronics equipment. The bass and treble controls on a stereo are filters for
the low and high frequencies: increasing the bass response means boosting the lower
frequencies as they pass through the amplifier. In other words, $H(\omega)$ is increased for
values of $\omega$ in the range called “bass.” A graphic equalizer is a similar device, but
it has multiple controls, each controlling its own interval of frequencies. Everyday
filters also appear in lighting equipment; a tinted lamp shade reduces the amplitude
of some frequencies of light.

Because of the importance of filters, we will now look more closely at filtering 
in both signal and frequency space. We will see that we can choose to apply a 
filter in frequency space or in signal space, depending on which is more intuitive or 
computationally practical. 

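We have seen that the response $y(t)$ of any LTI system may be found from the
convolution of the input signal $x(t)$ with the system’s impulse response $h(t)$. So we
begin our study of filtering by looking at the Fourier transform of a convolution. We
want to find $Y(\omega)$, the Fourier transform of $f(t) * h(t)$.

$$\begin{aligned}
Y(\omega) &= \kappa \langle \psi | f * h \rangle \\
&= \kappa \int \left(f(t) * h(t)\right) e^{-j\omega t}\, dt \\
&= \kappa \int \left(\int f(\tau)\, h(t - \tau)\, d\tau\right) e^{-j\omega t}\, dt
\end{aligned} \tag{5.93}$$

We can switch the order of integration, writing

$$\begin{aligned}
Y(\omega) &= \kappa \int f(\tau) \left(\int h(t - \tau)\, e^{-j\omega t}\, dt\right) d\tau \\
&= \kappa \int f(\tau)\, e^{-j\omega\tau} H(\omega)\, d\tau \\
&= H(\omega)\, \kappa \int f(\tau)\, e^{-j\omega\tau}\, d\tau \\
&= H(\omega) F(\omega)
\end{aligned} \tag{5.94}$$

$$\mathcal{F}\{f(t) * h(t)\} = F(\omega) H(\omega) \tag{5.95}$$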


where in the second step we applied the delay property from Table 5.1, which states
that

$$\begin{aligned}
\mathcal{F}\{g(t - t_0)\} &= e^{-j\omega t_0}\, \mathcal{F}\{g(t)\} \\
&= e^{-j\omega t_0}\, G(\omega)
\end{aligned} \tag{5.96}$$
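Property name                Transform pair

Transformation               $f(t) \leftrightarrow F(\omega)$
Linearity                    $a f(t) + b g(t) \leftrightarrow a F(\omega) + b G(\omega)$
Duality                      $F(t) \leftrightarrow f(-\omega)$
Scaling
Delay                        $f(t - t_0) \leftrightarrow e^{-j\omega t_0} F(\omega)$
Modulation                   $e^{j\omega_0 t} f(t) \leftrightarrow F(\omega - \omega_0)$
Convolution                  $f(t) * h(t) \leftrightarrow F(\omega) H(\omega)$
Multiplication               $f(t)\, h(t) \leftrightarrow F(\omega) * H(\omega)$
Time differentiation         $\frac{d^n}{dt^n} f(t) \leftrightarrow (j\omega)^n F(\omega)$
Time integration
Frequency differentiation
Frequency integration
Reversal                     $f(-t) \leftrightarrow F(-\omega)$

TABLE 5.1

Fourier transform properties.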


The principle of duality tells us that there is also a symmetrical relationship to
Equation 5.95, specifically that two spectra may be convolved by multiplying their
associated time signals. This is known as the multiplication property:

$$F(\omega) * H(\omega) \longleftrightarrow f(t)\, h(t) \tag{5.97}$$

The central point of this section may be summarized in the following way.
Convolving two signals is identical to multiplying their spectra. Convolving two spectra
is identical to multiplying their signals. In symbols,

$$\begin{aligned}
F(\omega) H(\omega) &= \mathcal{F}\{f(t) * h(t)\} \\
f(t)\, h(t) &= \mathcal{F}^{-1}\{F(\omega) * H(\omega)\}
\end{aligned} \tag{5.98}$$

Equation 5.98 is important both theoretically and practically. Recall from our
discussion of convolution that a direct implementation of convolution can be very
expensive. When signals become sufficiently large, it may be cheaper to take the
Fourier transform of both signals, multiply their spectra, and then take the inverse
transform back into signal space. This is even more attractive if the frequency
response $H(\omega)$ can be precomputed and saved.
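The transform, multiply, inverse-transform route can be sketched for sampled, periodic signals. The code below is an illustration rather than the method of the text: it uses a naive discrete Fourier transform with the common unnormalized convention (no factor on the forward transform, 1/N on the inverse) instead of the symmetric convention used in this book, and the convolution is circular. The function names are hypothetical. A naive transform costs as much as direct convolution; the savings described above only appear when a fast transform algorithm is substituted.

```python
import cmath

def dft(x):
    """Unnormalized DFT: X[k] = sum_n x[n] exp(-j 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse of dft: x[n] = (1/N) sum_k X[k] exp(+j 2 pi k n / N)."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolve(x, h):
    """Direct circular convolution of two equal-length sequences."""
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]

def convolve_via_spectra(x, h):
    """Multiply the two spectra, then transform back into signal space."""
    X, H = dft(x), dft(h)
    return idft([a * b for a, b in zip(X, H)])
```

For two short zero-padded sequences, `circular_convolve(x, h)` and `convolve_via_spectra(x, h)` should agree up to rounding error.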

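To find the impulse response $h(t)$ from an input $x(t)$ and an output $y(t)$, we
observe that $y(t) = x(t) * h(t)$, or equivalently, $Y(\omega) = X(\omega) H(\omega)$. Thus

$$H(\omega) = \frac{Y(\omega)}{X(\omega)} \tag{5.99}$$

so the impulse response $h(t)$ may be found from

$$h(t) = \mathcal{F}^{-1}\left\{\frac{Y(\omega)}{X(\omega)}\right\} \tag{5.100}$$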

If we measure the output y(t) for a known input, we can find the impulse response 
numerically. If we know the output analytically and we can find the appropriate 
Fourier transforms, we can find an analytic expression for the impulse response h(t) 
to characterize the system. 
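Finding an impulse response numerically from a measured input and output can be sketched as a division of spectra. This is a minimal illustration for sampled, periodic signals, assuming a naive unnormalized discrete Fourier transform and an input whose spectrum has no zeros (otherwise the division fails); the function names are hypothetical.

```python
import cmath

def dft(x):
    """Unnormalized DFT: X[k] = sum_n x[n] exp(-j 2 pi k n / N)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * cmath.pi * k * n / N) for n in range(N))
            for k in range(N)]

def idft(X):
    """Inverse of dft, with the 1/N factor on this side."""
    N = len(X)
    return [sum(X[k] * cmath.exp(2j * cmath.pi * k * n / N) for k in range(N)) / N
            for n in range(N)]

def circular_convolve(x, h):
    """Direct circular convolution, used here to simulate the system output."""
    N = len(x)
    return [sum(x[m] * h[(n - m) % N] for m in range(N)) for n in range(N)]

def estimate_impulse_response(x, y):
    """Recover h from input x and output y by dividing spectra: H = Y / X."""
    X, Y = dft(x), dft(y)
    h = idft([Yk / Xk for Xk, Yk in zip(X, Y)])
    return [value.real for value in h]   # real input and output give a real h
```

Simulating an output with `circular_convolve(x, h_true)` and passing it to `estimate_impulse_response` should return `h_true` up to rounding error.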

In general, almost all filtering tasks can be usefully examined in frequency space, 
where we ask what happens to the spectrum of a signal as it passes through some 
system. Typically an important part of the analysis involves considering how the 
spectrum of the signal is scaled by the frequency response of the system, which for 
a linear, time-invariant system is the Fourier representation of its impulse response. 
Thus such a filtering task may be viewed either as multiplication of two spectra or 
convolution of the signals. In practice, both approaches have advantages in different 
hardware and software settings. 

Convolution has an intuitive interpretation when one of the signals consists only
of impulses. For example, suppose a system with frequency response $H(\omega)$ is given
as input a signal $x(t)$ that is a string of impulses:

$$x(t) = \sum_n \delta(t - nT) \tag{5.101}$$

Then as we have seen, $X(\omega)$ is also a string of impulses, so $\mathcal{F}\{x(t) * h(t)\} =
X(\omega) H(\omega)$ with $X(\omega)$ from Equation 5.82 will yield

$$\begin{aligned}
Y(\omega) &= X(\omega) H(\omega) \\
&= \frac{\kappa_T^2}{\kappa} \sum_k \delta(\omega - k\, 2\pi/T)\, H(\omega)
\end{aligned} \tag{5.102}$$
FIGURE 5.25

The frequency-space equivalent of convolution with the shah function.


Equation 5.102 shows us that when a series of pulses is convolved with a signal, 
the resulting spectrum is built by taking samples of the system’s frequency response. 
This is shown in Figure 5.25. 


5.9.1 Some Common Filters

Of the many common types of filters, three in particular will be useful to us. We 
will first review these filters in their ideal (or perfect) forms; they are illustrated in 
Figure 5.26. 

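An ideal (or perfect) low-pass filter allows frequencies below some threshold $\omega_T$
to pass unchanged, while those above the threshold are removed; in other words, an
ideal low-pass filter $L(\omega)$ has frequency response

$$L(\omega) = \begin{cases} 1 & \omega < \omega_T \\ 0 & \omega > \omega_T \end{cases} \tag{5.103}$$

An ideal high-pass filter $H(\omega)$ has the opposite characteristic:

$$H(\omega) = \begin{cases} 0 & \omega < \omega_T \\ 1 & \omega > \omega_T \end{cases} \tag{5.104}$$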


An ideal band-pass filter $B(\omega)$ combines the previous two, passing only those
frequencies within a given band:

$$B(\omega) = \begin{cases} 0 & \omega < \omega_l \\ 1 & \omega_l < \omega < \omega_h \\ 0 & \omega > \omega_h \end{cases} \tag{5.105}$$

FIGURE 5.26

(a) An ideal low-pass filter. (b) An ideal high-pass filter. (c) An ideal band-pass filter.

Because each of these filters has a finite width in frequency space, they have
infinite width in signal space; such filters are called infinite impulse response (IIR)
filters. By comparison, a filter with finite support in signal space is called a finite
impulse response (FIR) filter.

Consider trying to implement an ideal low-pass filter in signal space. Since
the impulse response is infinite, its implementation requires convolving with the
signal over all time, from before the big bang to after eternity. This is impossible,
so somehow we need to convert our ideal IIR filters into realizable FIR filters.
The typical approach is to window the filter, by multiplying the impulse response
with some function that has finite support. Of course, this changes the impulse
response, which changes the filter characteristics in frequency space. Some standard
windowing functions include the box, the Gaussian bump, and the sinc function. The
trade-offs involved in windowing a filter appropriately are complicated; pointers to
the literature appear in the Further Reading section.
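Windowing can be illustrated with a short sketch. It works in discrete time, which is an assumption made here for the sake of a runnable example: the ideal low-pass impulse response with cutoff $\omega_c$ radians per sample is taken as $(\omega_c/\pi)\operatorname{sinc}(\omega_c n/\pi)$ with the normalized sinc, and it is truncated by a box window or tapered by a truncated Gaussian. The function names and the `half_width` and `sigma` parameters are hypothetical.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)

def ideal_lowpass_tap(n, cutoff):
    """Sample n of the ideal (infinitely long) low-pass impulse response."""
    return (cutoff / math.pi) * sinc(cutoff * n / math.pi)

def windowed_lowpass(cutoff, half_width, sigma=None):
    """FIR taps for n in [-half_width, half_width].

    A box window is used when sigma is None; otherwise the taps are also
    multiplied by the Gaussian bump exp(-(n / sigma)^2).
    """
    taps = []
    for n in range(-half_width, half_width + 1):
        window = 1.0 if sigma is None else math.exp(-(n / sigma) ** 2)
        taps.append(ideal_lowpass_tap(n, cutoff) * window)
    return taps

def frequency_response(taps, half_width, omega):
    """Unnormalized response sum_n h[n] exp(-j omega n) of the windowed filter."""
    indices = range(-half_width, half_width + 1)
    return sum(h * cmath.exp(-1j * omega * n) for h, n in zip(taps, indices))
```

Evaluating `frequency_response` well inside the pass band should give a magnitude near 1, and well inside the stop band a magnitude near 0; close to the cutoff the response of the windowed filter is no longer a sharp step.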

When a filter is windowed to make it physically possible, the perfect behavior
shown in Figure 5.26 is degraded. In general, each of the sharp discontinuities is
smeared into a range of frequencies, and the filter response rings near the transitions.
Figure 5.27 shows a close-up near the discontinuity in a low-pass filter. The ideal
threshold frequency $\omega_T$ is spread out into a transition band $(\omega_p, \omega_s)$, within which
the filter’s response drops off. The region below $\omega_p$ is called the pass band, and the
region above $\omega_s$ is called the stop band. The locations of $\omega_p$ and $\omega_s$ are generally
determined by the application; they are typically placed as near the discontinuity as
possible, without allowing the response in the pass- or stop-bands to stray “too far”
from flatness. Notice that the filter ripples, or rings, near the transition band. As
a filter approaches the ideal, it rings less, the response in the stop and pass bands
flattens, and the transition band narrows; this causes an unavoidable widening of
the impulse response in the signal domain.

FIGURE 5.27

A realizable filter has no sharp discontinuities.

We will often use the ideal filters in our theoretical material, though we will 
always need to deal with realizable filters when discussing implementations. 


5.10 The Fourier Transform Table 

Table 5.1 lists a number of properties of the Fourier transform, with their common
names. We will not prove these properties, since we will only be using them
occasionally. Detailed proofs may be found in most signal-processing texts. Because
of duality, we can reverse the meanings of the variables $t$ and $\omega$, as in the case of
multiplication and convolution.
5.11 Discrete-Time Fourier Representations

Most computer graphics are concerned with sampled signals. The basic ideas of
Fourier transforms can be applied to such signals, but there are a few subtle,
significant changes that make the results quite different.

In this section we will derive expressions for the discrete-time versions of the
Fourier series and Fourier transform. Remember that as with the term
“continuous-time,” the phrase “discrete-time” actually refers to any discrete parameter.


5.11.1 The Discrete-Time Fourier Series

We begin by deriving the expressions for the discrete-time Fourier series. We will
base our discussion on finding the transform for a periodic signal $x[n]$, which is
defined over the interval $[0, N) = [N]$. Since $x[n]$ is periodic, $x[n + N] = x[n]$.

We begin with the basis functions. By analogy with the Fourier transform, we
will use sampled complex exponentials with period $N$. The complete set of these is
given by

$$\psi'_k[n] = e^{jk(2\pi/N)n}, \quad k \in [N] \tag{5.106}$$

When the index $k$ is part of a summation, or is otherwise understood in context, we
will usually omit it and write simply $\psi'[n]$.

In contrast to the continuous case, there are only $N$ discrete complex exponentials
of period $N$. To see this, write the value of $\psi'_{k+N}[n]$ for any integer $k$:

$$\begin{aligned}
\psi'_{k+N}[n] &= e^{j(k+N)(2\pi/N)n} \\
&= e^{jk(2\pi/N)n}\, e^{jN(2\pi/N)n} \\
&= \psi'_k[n]\, e^{j2\pi n} \\
&= \psi'_k[n]
\end{aligned} \tag{5.107}$$

where the last line used E17. So after we have created the first $N$ sampled complex
exponentials of period $N$, we start repeating them, beginning with $\psi'_N[n] = \psi'_0[n]$.
This observation has several important ramifications; we will see that this property
is one of the factors leading to aliasing in digital systems.

It will be useful to us to know that the sum of any $\psi'_k[n]$ over any period (that is,
any set of $N$ contiguous samples) is 0, unless the function is constant. That is,

$$\sum_{n=0}^{N-1} e^{jk(2\pi/N)n} = \sum_{n \in [N]} e^{jk(2\pi/N)n} = \begin{cases} N & k = 0, \pm N, \pm 2N, \ldots \\ 0 & \text{otherwise} \end{cases} \tag{5.108}$$


Equation 5.108 comes directly from the finite sum of a geometric series. 
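Both facts about the sampled exponentials, the repetition after $N$ indices and the sum over a period, are easy to confirm numerically. The sketch below is only an illustration; `psi` and `sum_over_period` are hypothetical names, and `start` selects which $N$ contiguous samples are summed.

```python
import cmath

def psi(k, n, N):
    """Sampled complex exponential exp(j k (2 pi / N) n)."""
    return cmath.exp(1j * k * (2.0 * cmath.pi / N) * n)

def sum_over_period(k, N, start=0):
    """Sum of psi(k, n, N) over the N contiguous samples beginning at n = start."""
    return sum(psi(k, n, N) for n in range(start, start + N))
```

For N = 8, `sum_over_period(k, 8)` should be 8 when k is a multiple of 8 and essentially 0 otherwise, whatever the value of `start`, and `psi(k + 8, n, 8)` should equal `psi(k, n, 8)` for every n.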
We now turn to representation of a discrete signal $x[n]$ by these basis functions.
Again by analogy to the Fourier transform, we write the signal as a linear
combination of the (now discrete) basis functions:

$$x[n] = \sum_{k \in [N]} a_k \psi'_k[n] = \sum_{k \in [N]} a_k\, e^{jk(2\pi/N)n} = \langle a | \psi' \rangle_{[N]} \tag{5.109}$$

Notice that the summation only includes $N$ distinct terms; this is because there
are only $N$ distinct complex sinusoids, as we discovered above. We can expand
Equation 5.109 into a matrix equation, expressing the $N$ unknown scalars $a_k$ in
terms of the $N$ known $x[n]$ values and the $N \times N$ matrix of exponentials.

$$\begin{bmatrix} x_0 & x_1 & \cdots & x_{N-1} \end{bmatrix} =
\begin{bmatrix} a_0 & a_1 & \cdots & a_{N-1} \end{bmatrix}
\begin{bmatrix}
1 & 1 & 1 & 1 & \cdots & 1 \\
1 & e^{\gamma} & e^{2\gamma} & e^{3\gamma} & \cdots & e^{(N-1)\gamma} \\
1 & e^{\gamma 2} & e^{2\gamma 2} & e^{3\gamma 2} & \cdots & e^{(N-1)\gamma 2} \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
1 & e^{\gamma(N-2)} & e^{2\gamma(N-2)} & e^{3\gamma(N-2)} & \cdots & e^{(N-1)\gamma(N-2)} \\
1 & e^{\gamma(N-1)} & e^{2\gamma(N-1)} & e^{3\gamma(N-1)} & \cdots & e^{(N-1)\gamma(N-1)}
\end{bmatrix} \tag{5.110}$$

where for compactness we have substituted

$$\gamma = \frac{j 2\pi}{N} \tag{5.111}$$


If we write Equation 5.110 in matrix form as $X = AF$, then we could solve for
the coefficients by inverting $F$, yielding $A = XF^{-1}$. This inversion is only possible
if $F$ is nonsingular. We know this condition is met because it is built from the $\psi'[n]$
functions, which are orthogonal; thus, the columns are linearly independent, and the
matrix may be inverted.

But matrix inversion is a costly process, particularly for large matrices. We
would prefer a simple, closed-form solution for the coefficients $a_k$ in Equation 5.109
in terms of the $x[n]$.

To find this expression, multiply Equation 5.109 on both sides by $e^{-jr(2\pi/N)n}$
and sum over $N$ terms:

$$\sum_{n \in [N]} x[n]\, e^{-jr(2\pi/N)n} = \sum_{n \in [N]} \sum_{k \in [N]} a_k\, e^{j(k-r)(2\pi/N)n} \tag{5.112}$$

Since summation is commutative, we can reverse the order of summation in the
right-hand term, and pull out the now-constant $a_k$:

$$\sum_{n \in [N]} x[n]\, e^{-jr(2\pi/N)n} = \sum_{k \in [N]} a_k \sum_{n \in [N]} e^{j(k-r)(2\pi/N)n} \tag{5.113}$$
Consider the rightmost summation in Equation 5.113. Equation 5.108 says that
when $k = r$, it will have the value $N$, and it will be 0 otherwise. Thus, the terms on
the right-hand side will be of the form $a_k \cdot 0$ or $a_r N$, the latter being the only nonzero
term, and only appearing at $r = k$. So the entire right-hand side is simply $a_r N$.
Dividing by $N$ gives the closed-form expression for the coefficients we sought:

$$a_r = \frac{1}{N} \sum_{n \in [N]} x[n]\, e^{-jr(2\pi/N)n} = \frac{1}{N} \langle \psi'_r | x \rangle_{[N]} \tag{5.114}$$

Together, Equations 5.109 and 5.114 show how to transform a sampled function
$x[n]$ with period $N$ into and out of frequency space. We distribute the normalizing
term $1/N$ symmetrically as before, writing $\kappa_N = 1/\sqrt{N}$. We define the discrete-time
Fourier series pair below:

$$x[n] = \kappa_N \sum_{k \in [N]} a_k\, e^{jk(2\pi/N)n} = \kappa_N \langle a | \psi' \rangle_{[N]} \tag{5.115}$$

$$a_k = \kappa_N \sum_{n \in [N]} x[n]\, e^{-jk(2\pi/N)n} = \kappa_N \langle \psi' | x \rangle_{[N]} \tag{5.116}$$

As before, the expression for $x[n]$ is called the synthesis equation, and that for
$a_k$ is called the analysis equation.

Recall from the definition that the summation range represented by $[N]$ may be
satisfied by any $N$ contiguous values from $m \bmod N$ to $(m + N) \bmod N$. Suppose
we write out the expansion explicitly for both $m = 0$ and $m = 1$:

$$\begin{aligned}
x_{m=0}[n] &= a_0 \psi'_0[n] + a_1 \psi'_1[n] + \cdots + a_{N-1} \psi'_{N-1}[n] \\
x_{m=1}[n] &= a_1 \psi'_1[n] + \cdots + a_{N-1} \psi'_{N-1}[n] + a_N \psi'_N[n]
\end{aligned} \tag{5.117}$$

Since $x_{m=0}[n] = x_{m=1}[n]$, and recalling from Equation 5.107 that $\psi'_0[n] = \psi'_N[n]$,
we find

$$\begin{aligned}
x_{m=0}[n] - x_{m=1}[n] &= 0 = a_0 \psi'_0[n] - a_N \psi'_N[n] \\
a_0 &= a_N
\end{aligned} \tag{5.118}$$

Since this holds true for all choices of $m$, we find that

$$a_k = a_{k+N} \tag{5.119}$$

Equation 5.119 is very important; it is one of the essential characteristics of the
discrete Fourier series for discrete signals. It comes about because there are only $N$
distinct complex exponentials with period $N$; thus, the discrete-time Fourier series
contains only $N$ unique terms. The coefficient $a_0$ represents the amplitude of the
constant part of the signal, and larger values of $k$ refer to coefficients on higher
frequencies of the signal.

FIGURE 5.28

Building a periodic signal $\tilde{x}[n]$ from an aperiodic signal $x[n]$.
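The discrete-time Fourier series pair with the symmetric factor $\kappa_N = 1/\sqrt{N}$ on both equations, and the periodicity of the coefficients, can be exercised with a few lines of code. This is a minimal sketch in which one period of the signal is stored as a list; the function names are hypothetical.

```python
import cmath
import math

def dfs_coefficient(x, k):
    """a_k = kappa_N * sum over one period of x[n] exp(-j k (2 pi / N) n), for any integer k."""
    N = len(x)
    kappa_N = 1.0 / math.sqrt(N)
    return kappa_N * sum(x[n] * cmath.exp(-1j * k * (2.0 * math.pi / N) * n)
                         for n in range(N))

def dfs_analysis(x):
    """The N distinct coefficients a_0 .. a_{N-1}."""
    return [dfs_coefficient(x, k) for k in range(len(x))]

def dfs_synthesis(a):
    """x[n] = kappa_N * sum_k a_k exp(+j k (2 pi / N) n), for n = 0 .. N-1."""
    N = len(a)
    kappa_N = 1.0 / math.sqrt(N)
    return [kappa_N * sum(a[k] * cmath.exp(1j * k * (2.0 * math.pi / N) * n)
                          for k in range(N))
            for n in range(N)]
```

Running a short list through `dfs_analysis` and then `dfs_synthesis` should reproduce it up to rounding, and `dfs_coefficient(x, k + N)` should match `dfs_coefficient(x, k)`.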


5.11.2 The Discrete-Time Fourier Transform

We now turn to the discrete-time Fourier transform (DTFT). This is appropriate for
aperiodic signals. As with the continuous-time Fourier transform, the basic idea is
to approximate an aperiodic signal $x[n]$ with a periodic signal $\tilde{x}[n]$ that matches $x$
within one period. Where $x[n]$ goes to zero, $\tilde{x}[n]$ repeats the signal $x[n]$ over and
over. As the period tends to infinity, the match covers more of the input range and
is thus more accurate.

We begin with a finite-width signal $x[n]$. Without loss of generality, we assume
that this signal is centered around 0, and $x[n] = 0$ if $|n| > W/2$. We construct an
approximation $\tilde{x}[n]$ by

$$\tilde{x}[n] = \begin{cases} x[n] & |n| < N/2 \\ x[n \bmod N/2] & |n| > N/2 \end{cases} \tag{5.120}$$

as shown in Figure 5.28.

We can write the discrete Fourier series approximation for $\tilde{x}[n]$ from
Equation 5.115:

$$\tilde{x}[n] = \kappa_N \langle a | \psi' \rangle_{[N]} \tag{5.121}$$

So we start by finding the coefficients

$$\begin{aligned}
a_k &= \kappa_N \langle \psi' | \tilde{x} \rangle_{[-W/2,\, W/2]} \\
&= \kappa_N \langle \psi' | x \rangle \\
&= \kappa_N \sum_{n=-\infty}^{+\infty} e^{-jk(2\pi/N)n}\, x[n]
\end{aligned} \tag{5.122}$$

The infinite limits on the last line come from the fact that $x[n]$ is 0 outside of
$|n| \le W/2$, by definition. As in the continuous case, we write the coefficients as samples
of a continuous function $X(\Omega)$:

$$a_k = \kappa_N X(k\Omega) \tag{5.123}$$

where $X(\Omega)$ is defined by

$$X(\Omega) = \sum_{n=-\infty}^{+\infty} x[n]\, e^{-j\Omega n} = \langle \psi' | x \rangle, \qquad \Omega = 2\pi/N \tag{5.124}$$

Note that we are using $\Omega$ to represent frequency for discrete signals; the
time-frequency pairs are usually $t$ and $\omega$ for continuous time and $n$ and $\Omega$ for discrete
time. Thus each $a_k$ is one of a set of equally spaced samples of the “envelope”
formed by $X(\Omega)$. We can now plug these coefficients into the synthesis formula:

$$\begin{aligned}
\tilde{x}[n] &= \kappa_N \langle a | \psi' \rangle_{[N]} \\
&= \kappa_N \sum_{k \in [N]} \kappa_N X(k\Omega)\, e^{jk\Omega n} \\
&= \frac{1}{2\pi} \sum_{k \in [N]} X(k\Omega)\, e^{jk\Omega n}\, \Omega
\end{aligned} \tag{5.125}$$

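The last step comes from noting that $\kappa_N^2 = 1/N = \Omega/2\pi$. This equation represents
a discrete approximation to an integral, where each rectangle has height $X(k\Omega)\, e^{jk\Omega n}$
and width $\Omega$, as shown in Figure 5.29.

As $N \to \infty$, $\tilde{x}[n] \to x[n]$, and $\Omega \to 0$. So the summation in Equation 5.125 passes
to integration and becomes

$$x[n] = \frac{1}{2\pi} \int_{2\pi} X(\Omega)\, e^{j\Omega n}\, d\Omega = \kappa^2 \langle X | \psi' \rangle_{[2\pi]} \tag{5.126}$$

Since $X(\Omega)\, e^{j\Omega n}$ is periodic with period $2\pi$, any region of integration with width
$2\pi$ will work. We have now found the defining equations for the discrete-time Fourier
transform, which are summarized below with the normalizing factor $1/2\pi$ distributed
symmetrically as $\kappa = 1/\sqrt{2\pi}$ on each equation.

$$x[n] = \kappa \int_{2\pi} X(\Omega)\, e^{j\Omega n}\, d\Omega = \kappa \langle X | \psi' \rangle_{[2\pi]} \tag{5.127}$$

$$X(\Omega) = \kappa \sum_n x[n]\, e^{-j\Omega n} = \kappa \langle \psi' | x \rangle \tag{5.128}$$

FIGURE 5.29

The expression $\sum_{k \in [N]} X(k\Omega)\, e^{jk\Omega n}\, \Omega$ approximates an integral.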

n 

We will extend the notation of Equation 5.57 to discrete signals as follows:

$$\begin{aligned}
X(\Omega) &= \mathcal{F}\{x[n]\} \\
x[n] &= \mathcal{F}^{-1}\{X(\Omega)\} \\
x[n] &\stackrel{\mathcal{F}}{\longleftrightarrow} X(\Omega)
\end{aligned} \tag{5.129}$$

Table 5.4 gives the properties of the discrete-time Fourier transform. These properties
are similar to those listed in Table 5.1 for the continuous case.

The Dirichlet conditions that specified convergence of the Fourier transform
integral have counterparts in the digital domain. One way to state these conditions is
to require the discrete function $x[n]$ be absolutely summable [326], which means it
satisfies

$$\sum_n |x[n]| < \infty \tag{5.130}$$
FIGURE 5.30

(a) $x[0]$. (b) $x[1]\, e^{-j\Omega}$. (c) $x[2]\, e^{-2j\Omega}$. (d) Parts (a), (b), and (c) added together.

An alternative, weaker condition requires the signal to have finite energy (this is
weaker because some signals that are not absolutely summable can have finite
energy). The energy $E(x)$ of a discrete function $x[n]$ is defined as

$$E(x) = \sum_n |x[n]|^2 = \langle x | x \rangle \tag{5.131}$$

If $E(x) < \infty$, the summation will converge.


Example of the DTFT

As an example of the discrete-time Fourier transform, let’s suppose we have an input
signal $x[n]$ that is nonzero for only $n = 0, 1, 2$. Let’s take the Fourier transform of
this signal:

$$\begin{aligned}
X(\Omega) &= \kappa \langle \psi' | x \rangle \\
&= \kappa \sum_n x[n]\, e^{-j\Omega n} \\
&= \kappa \left(x[0] + x[1]\, e^{-j\Omega} + x[2]\, e^{-2j\Omega}\right)
\end{aligned} \tag{5.132}$$

So $X(\Omega)$ is built from just the first three complex exponentials. These are graphed
in Figure 5.30. Note that although $x[n]$ was aperiodic, $X(\Omega)$ is periodic.
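We’ll now confirm that we get back what we started with. Before we start, we
can use property E21 to find the identity

$$\int_{2\pi} e^{j\Omega a}\, d\Omega = 2\pi\, e^{ja\pi} \operatorname{sinc}(a) \tag{5.133}$$

where the integral runs over $[0, 2\pi]$. We also recall that $\operatorname{sinc}(k) = 0$ for any nonzero
integer $k$. Let’s now take the inverse transform of the spectrum in Equation 5.132:

$$\begin{aligned}
x[n] &= \kappa \langle X | \psi' \rangle_{[2\pi]} \\
&= \kappa \int_{2\pi} X(\Omega)\, e^{j\Omega n}\, d\Omega \\
&= \kappa \int_{2\pi} \kappa \left(x[0] + x[1]\, e^{-j\Omega} + x[2]\, e^{-2j\Omega}\right) e^{j\Omega n}\, d\Omega \\
&= \kappa^2 \left\{ x[0] \int_{2\pi} e^{j\Omega n}\, d\Omega + x[1] \int_{2\pi} e^{j\Omega(n-1)}\, d\Omega + x[2] \int_{2\pi} e^{j\Omega(n-2)}\, d\Omega \right\}
\end{aligned} \tag{5.134}$$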

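Using the identity from Equation 5.133, we write the particular value of this
expression for $n = 0$:

$$\begin{aligned}
x[0] &= \kappa^2 \left\{ x[0] \int_{2\pi} 1\, d\Omega + x[1] \int_{2\pi} e^{-j\Omega}\, d\Omega + x[2] \int_{2\pi} e^{-2j\Omega}\, d\Omega \right\} \\
&= \frac{1}{2\pi} \left\{ x[0]\, 2\pi + x[1]\, 2\pi e^{-j\pi} \operatorname{sinc}(-1) + x[2]\, 2\pi e^{-2j\pi} \operatorname{sinc}(-2) \right\} \\
&= x[0]
\end{aligned} \tag{5.135}$$

because sinc is even and $\operatorname{sinc}(1) = \operatorname{sinc}(2) = 0$. Similar expressions for $x[1]$ and $x[2]$
yield the same results, so we have successfully transformed our signal $x[n]$ into the
frequency domain to get $X(\Omega)$, and then transformed that spectrum back into the
signal domain to recover $x[n]$.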
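The round trip for a three-sample signal can be reproduced numerically. The sketch below assumes the symmetric convention with $\kappa = 1/\sqrt{2\pi}$ on both the analysis sum and the synthesis integral, and evaluates the integral over $[0, 2\pi)$ with a midpoint rule; the function names and sample values are illustrative.

```python
import cmath
import math

KAPPA = 1.0 / math.sqrt(2.0 * math.pi)

def dtft(x, omega):
    """X(Omega) = kappa * sum_n x[n] exp(-j Omega n), with x[n] = 0 outside range(len(x))."""
    return KAPPA * sum(x[n] * cmath.exp(-1j * omega * n) for n in range(len(x)))

def inverse_dtft(spectrum, n, steps=256):
    """Midpoint-rule estimate of kappa * integral over [0, 2 pi) of X(Omega) exp(j Omega n) dOmega."""
    d_omega = 2.0 * math.pi / steps
    total = 0j
    for i in range(steps):
        omega = (i + 0.5) * d_omega
        total += spectrum(omega) * cmath.exp(1j * omega * n) * d_omega
    return KAPPA * total
```

Passing `lambda w: dtft(x, w)` as the spectrum recovers `x[n]` for n = 0, 1, 2 and gives values near zero for other n; evaluating `dtft` at `omega` and at `omega + 2 * math.pi` shows the periodicity of the spectrum.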


5.12 Fourier Series and Transforms


We have covered four types of Fourier representations. In the continuous domain, we
had a series and a transform definition for periodic and aperiodic signals, respectively.
We had the same pair in the discrete domain. Table 5.2 presents a summary of these
results in both integral and braket forms.

$$\begin{aligned}
\psi &= e^{j\omega t} & \kappa &= 1/\sqrt{2\pi} \\
\psi_k &= e^{jk\omega t} & \kappa_T &= 1/\sqrt{T} \\
\psi'_k &= e^{jk(2\pi/N)n} & \kappa_N &= 1/\sqrt{N}
\end{aligned} \tag{5.136}$$

TABLE 5.2

Fourier transform summary.
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Notice how similar all the braket forms appear. This is one reason we use the 
braket in this book; it clearly shows that in all domains, the Fourier transformation 
is simply the projection of a signal onto a new basis made of complex exponentials. 
The interpretation of the braket as a sum or integral, and the appropriate range of 
the operator; are the only distinguishing factors between the various forms. 

A list of some useful transform pairs for the CT Fourier transform is shown in 
Table 5.3. 

Many more Fourier transform pairs may be found in the references listed under
Further Reading. Always be sure that you know what normalization convention is
used by each author. We are using the symmetrical convention in this book, but the
2π normalization factor might appear on only one equation or the other, or in the
exponent of the basis functions. You can generally tell the convention by looking at
one of the simple pairs, such as the transform of a constant or an impulse function.

If you use symbolic or numerical mathematics software to compute Fourier
transforms, you’ll again need to know how that package treats the normalization
constants. Try a couple of simple transforms to see what convention is used by that
software. Note that different packages within the same system may use different
conventions.
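One way to carry out that check is to transform a unit impulse and look at the constant that comes back. The sketch below is illustrative only: `dft` is a plain O(N²) discrete transform written for this example with an explicit forward scale, and the three lambdas stand in for three hypothetical packages that put the normalization in different places.

```python
import cmath
import math

def dft(x, scale):
    """Plain O(N^2) discrete Fourier transform with an explicit forward scale."""
    n_samples = len(x)
    return [scale * sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                        for n in range(n_samples))
            for k in range(n_samples)]

def guess_forward_scale(forward, n_samples=8):
    """Transform a unit impulse; every bin then equals the forward constant."""
    impulse = [1.0] + [0.0] * (n_samples - 1)
    spectrum = forward(impulse)
    return abs(spectrum[0])

# Three hypothetical conventions for the forward transform.
unscaled = lambda x: dft(x, 1.0)                          # constant on the inverse only
symmetric = lambda x: dft(x, 1.0 / math.sqrt(len(x)))     # constant split evenly
scaled_forward = lambda x: dft(x, 1.0 / len(x))           # constant on the forward only
```

Calling `guess_forward_scale` on each of the three returns 1, 1/√N, and 1/N respectively, which is enough to tell the conventions apart.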


5.13 Convolution Revisited 

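Let’s find the DTFT of a convolution sum. From Equation 5.128 we can write the
Fourier transform for y[n]:

$$\begin{aligned}
Y(\Omega) &= \mathcal{F}\{y[n]\} \\
&= \kappa \langle \psi' \mid y \rangle \\
&= \kappa \sum_n y[n]\, e^{-j\Omega n} \\
&= \kappa \sum_n \sum_k x[k]\, h[n-k]\, e^{-j\Omega n}
\end{aligned} \tag{5.137}$$

We can switch the order of the summations, and noticing that x[k] is independent of
n, bring it out front:

$$\begin{aligned}
Y(\Omega) &= \kappa \sum_k x[k] \sum_n h[n-k]\, e^{-j\Omega n} \\
&= \kappa \sum_k x[k]\, e^{-j\Omega k}\, H(\Omega)
\end{aligned} \tag{5.138}$$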

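The last transformation comes from the time-shifting property of the Fourier transform,
which expresses the transform pair x[n − n₀] ↔ e^{−jΩn₀} X(Ω), as in Table 5.4.

Time signal                      Spectrum

$\delta(t)$                      $\kappa$
$a$                              $a\,\delta(\omega)/\kappa$
$\cos(at)$                       $\frac{1}{2\kappa}\,[\delta(\omega + a) + \delta(\omega - a)]$
$\sin(at)$                       $\frac{j}{2\kappa}\,[\delta(\omega + a) - \delta(\omega - a)]$
$e^{jat}$                        $\delta(\omega - a)/\kappa$
$e^{-a|t|}$                      $\kappa\,\frac{2a}{a^2 + \omega^2}$
$\delta(t - a)$                  $\kappa\, e^{-j\omega a}$
$\mathrm{III}_a(t)$              [entry missing]
$b_W(t)$                         $\kappa W\, \mathrm{sinc}\!\left(\frac{\omega W}{2\pi}\right)$
$\sum_n b_W(t - nT)$             $\frac{W}{\kappa T} \sum_n \mathrm{sinc}\!\left(\frac{nW}{T}\right) \delta\!\left(\omega - \frac{2\pi n}{T}\right)$

TABLE 5.3

Some CT Fourier transform pairs.

Property name                    Transform pair

Transformation                   $f[n] \leftrightarrow F(\Omega)$
Linearity                        $a f[n] + b g[n] \leftrightarrow a F(\Omega) + b G(\Omega)$
Duality                          $F[n] \leftrightarrow f(\Omega)$
Scaling                          $f[an] \leftrightarrow \frac{1}{|a|} F\!\left(\frac{\Omega}{a}\right)$
Delay                            $f[n - n_0] \leftrightarrow e^{-j\Omega n_0} F(\Omega)$
Modulation                       $e^{j\Omega_0 n} f[n] \leftrightarrow F(\Omega - \Omega_0)$
Convolution                      $f[n] * g[n] \leftrightarrow F(\Omega) G(\Omega)$
Multiplication                   $f[n] g[n] \leftrightarrow F(\Omega) * G(\Omega)$
Time differentiation             $\frac{d^a}{dn^a} f[n] \leftrightarrow (j\Omega)^a F(\Omega)$
Time integration                 $\int f[\tau]\, d\tau \leftrightarrow \frac{F(\Omega)}{j\Omega} + \pi F(0)\, \delta(\Omega)$
Frequency differentiation        $-jn\, f[n] \leftrightarrow \frac{dF(\Omega)}{d\Omega}$
Frequency integration            $\frac{f[n]}{-jn} \leftrightarrow \int F(\Omega')\, d\Omega'$
Reversal                         $f[-n] \leftrightarrow F(-\Omega)$

TABLE 5.4

Discrete Fourier transform properties.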


Since H(Ω) is independent of k, we can move it out of the summation, and then
notice that what’s left is the Fourier transform for x[n]:

$$\begin{aligned}
Y(\Omega) &= H(\Omega)\, \kappa \sum_k x[k]\, e^{-j\Omega k} \\
&= H(\Omega)\, X(\Omega)
\end{aligned} \tag{5.139}$$
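The product form of this result can be spot-checked numerically. The sketch below assumes the symmetric constant κ = 1/√(2π) and two short, arbitrary test signals; `dtft` and `convolve` are illustrative helpers written for this example. One caveat that the sketch makes explicit: if κ is carried through every step exactly, the product of the two normalized spectra picks up a constant factor of 1/κ, so the check compares Y(Ω) against X(Ω)H(Ω)/κ. The shape of the result (convolution becomes multiplication) is unaffected.

```python
import cmath
import math

KAPPA = 1.0 / math.sqrt(2.0 * math.pi)   # symmetric normalization (assumed)

def dtft(x, omega):
    """X(omega) = kappa * sum_n x[n] e^{-j omega n} for a finite signal."""
    return KAPPA * sum(xn * cmath.exp(-1j * omega * n) for n, xn in enumerate(x))

def convolve(x, h):
    """Direct convolution sum y[n] = sum_k x[k] h[n - k]."""
    y = [0.0] * (len(x) + len(h) - 1)
    for k, xk in enumerate(x):
        for m, hm in enumerate(h):
            y[k + m] += xk * hm
    return y

x = [1.0, 2.0, -1.0]
h = [0.5, 0.25, 0.25, 1.0]
y = convolve(x, h)

omega = 0.7                               # any test frequency will do
lhs = dtft(y, omega)                      # transform of the convolution
rhs = dtft(x, omega) * dtft(h, omega) / KAPPA   # product of the spectra
```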


Thus we have arrived at the convolution transform pair, which by duality extends
to the multiplication transform pair, summarized as

$$\begin{aligned}
X(\Omega)\, H(\Omega) &= \mathcal{F}\{x[n] * h[n]\} \\
x[n]\, h[n] &= \mathcal{F}^{-1}\{X(\Omega) * H(\Omega)\}
\end{aligned} \tag{5.140}$$


5.14 Two-Dimensional Fourier Transforms

The Fourier transform, in both continuous-time and discrete-time cases, can be
directly generalized to two dimensions. The only conceptual change is that we now
project a 2D signal onto a set of 2D basis functions. The practical result is that our
summations and integrals become double summations and integrals, the kernels of
the transformation become 2D, and the overall normalizing terms are squares of the
1D terms.

In this section we will skip over the derivation of the 2D Fourier series and head
directly to the transform.

Just as we chose (x, y) and [m, n] arbitrarily for the input spaces of generic 2D
signals, we represent the frequency domain of a continuous-time signal by the Greek
letters (μ, ν), and the frequency domain of a discrete-time signal by the Greek letters
(η, ξ).

5.14.1 Continuous-Time 2D Fourier Transforms

Our new basis functions will be denoted ψ₂ for the continuous-time case, and are
given by

$$\psi_2 = e^{j(\mu x + \nu y)} \tag{5.141}$$

This family of functions has the associated normalization constants

$$\kappa_2 = \frac{1}{2\pi} \tag{5.142}$$

The 2D continuous-time Fourier transform (2D CTFT) is given by the projection
of a 2D signal onto these 2D basis functions. Generalizing the 1D case, we have

$$\begin{aligned}
F(\mu, \nu) &= \frac{1}{2\pi} \iint f(x, y)\, e^{-j(\mu x + \nu y)}\, dx\, dy = \kappa_2 \langle \psi_2 \mid f \rangle \\
f(x, y) &= \frac{1}{2\pi} \iint F(\mu, \nu)\, e^{j(\mu x + \nu y)}\, d\mu\, d\nu = \kappa_2 \langle F \mid \psi_2 \rangle
\end{aligned} \tag{5.143}$$

The new kernel ψ₂ is separable. A function f(x, y) is separable if it is the product
of two functions, one dependent only on x and the other only on y:

$$f(x, y) = f_1(x)\, f_2(y) \tag{5.144}$$

The basis ψ₂ is separable since

$$\psi_2 = e^{j(\mu x + \nu y)} = e^{j\mu x}\, e^{j\nu y} \tag{5.145}$$

This observation can be a great help in practice. It means that we may write the
analysis equation as the cascade of two 1D transformations:

$$\begin{aligned}
F_y(\mu, y) &= \kappa \int f(x, y)\, e^{-j\mu x}\, dx \\
F(\mu, \nu) &= \kappa \int F_y(\mu, y)\, e^{-j\nu y}\, dy
\end{aligned} \tag{5.146}$$

Equation 5.146 has important practical ramifications. It tells us that we can
take the 2D Fourier transform of a signal by taking two successive 1D passes, either
transforming all the rows and then all the columns, or vice versa. The synthesis
formula shares this property.
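The row-then-column strategy is easy to try on a small sampled image. The following sketch uses a finite discrete transform as a stand-in for the continuous integrals, with a symmetric 1/√N constant per pass (both are assumptions made for the example, as are the function names). It compares a direct double sum against two cascaded 1D passes.

```python
import cmath
import math

def dft_1d(values):
    """1D discrete transform with a symmetric 1/sqrt(n) constant."""
    n = len(values)
    scale = 1.0 / math.sqrt(n)
    return [scale * sum(values[t] * cmath.exp(-2j * math.pi * k * t / n)
                        for t in range(n))
            for k in range(n)]

def dft_2d_direct(image):
    """2D transform as one double sum over the whole image."""
    rows, cols = len(image), len(image[0])
    scale = 1.0 / math.sqrt(rows * cols)
    return [[scale * sum(image[y][x]
                         * cmath.exp(-2j * math.pi * (u * x / cols + v * y / rows))
                         for y in range(rows) for x in range(cols))
             for u in range(cols)]
            for v in range(rows)]

def dft_2d_separable(image):
    """Same transform as two passes: every row, then every column."""
    row_pass = [dft_1d(row) for row in image]
    columns = [list(col) for col in zip(*row_pass)]      # transpose
    col_pass = [dft_1d(col) for col in columns]
    return [list(row) for row in zip(*col_pass)]         # transpose back

image = [[1.0, 2.0, 0.0, -1.0],
         [0.5, 0.0, 3.0, 2.0],
         [4.0, 1.0, 1.0, 0.0]]
direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
max_diff = max(abs(direct[v][u] - separable[v][u])
               for v in range(3) for u in range(4))
```

The two results agree to rounding error. The direct form costs on the order of (rows × cols)² operations, while the separable form needs only rows × cols × (rows + cols), which is the practical payoff of separability.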

Because of this property, the 2D Fourier transform inherits all of the properties 
of the ID transform, such as scaling, convolution, modulation, and so forth. 

The 2D complex Fourier transform is often illustrated by a pair of images. Typically
these images display the magnitude and phase of the transform, although some
authors present the real and imaginary parts instead. The dynamic range of the
transform is often much larger than that of photographic film and most printing
technologies, so it is usually altered before display. Because a linear compression
tends to lose much of the signal, most pictures of Fourier transforms instead present
the logarithms of the magnitude and phase. A typical presentation is shown in
Figure 5.31.

It is not unusual to also see the transform displayed in decibels (dB). The definition
of the decibel transformation is

$$\mathrm{dB}(x) = x_{\mathrm{dB}} = 20 \log_{10}(x) \tag{5.147}$$

To get a feeling for decibels, observe that

$$\mathrm{dB}(2x) \approx 6 + \mathrm{dB}(x) \tag{5.148}$$

so doubling of the input corresponds to an increase of about 6 decibels.
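The 6 dB figure is just 20 log₁₀(2). A two-line sketch, taking the definition with the sign for which doubling the input raises the result (as Equation 5.148 states); the function name `decibels` is illustrative:

```python
import math

def decibels(x):
    """Decibel value of a positive magnitude x."""
    return 20.0 * math.log10(x)

# Doubling any positive input adds the same amount, about 6.02 dB.
gain_of_doubling = decibels(2.0 * 5.0) - decibels(5.0)
```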

To see the 2D Fourier transform in action, look at Figures C and D in the 
introduction to this unit. They show the processing of a 2D signal, and its Fourier 
transform, through an idealized rendering system. 


Consider the 2D aperiodic box signal $b_{W,H}(x, y)$ shown in Figure 5.32. This signal
is defined by

$$b_{W,H}(x, y) = \begin{cases} 1 & |x| < W/2 \text{ and } |y| < H/2 \\ 0 & \text{otherwise} \end{cases} \tag{5.149}$$

FIGURE 5.31

The 2D Fourier transform. (a) An image f(x, y). (b) The magnitude of the transform: |F(μ, ν)|.
(c) The phase of the transform: Re{F(μ, ν)}/Im{F(μ, ν)}.

Before we take the transform we will pause for a moment to consider what to
expect. The input signal $b_{W,H}(x, y)$ is a separable function, built from the product
of two 1D boxes. We know that the Fourier transform of a box is a sinc, and we
saw above that a 2D transform may be taken as a sequence of two 1D transforms.
Intuitively, we would expect the horizontal pass to spread out the horizontal box
into a sinc, and the vertical pass to do the same. Thus we would expect the Fourier
transform of the 2D box to be the product of two 1D sinc functions, one along each
axis.

FIGURE 5.32

The 2D box $b_{W,H}(x, y)$ of width W and height H.

To confirm this analysis, we find the transform directly, starting with the definition
by E21.

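$$\begin{aligned}
B(\mu, \nu) &= \kappa_2 \langle \psi_2 \mid b_{W,H} \rangle \\
&= \frac{1}{2\pi} \iint b_{W,H}(x, y)\, e^{-j(\mu x + \nu y)}\, dx\, dy \\
&= \frac{1}{2\pi} \int_{-H/2}^{H/2} \int_{-W/2}^{W/2} e^{-j\mu x}\, e^{-j\nu y}\, dx\, dy \\
&= \frac{1}{2\pi} \left[ \int_{-W/2}^{W/2} e^{-j\mu x}\, dx \right] \left[ \int_{-H/2}^{H/2} e^{-j\nu y}\, dy \right] \\
&= \frac{1}{2\pi} \left[ W\, \mathrm{sinc}\!\left( \frac{\mu W}{2\pi} \right) \right] \left[ H\, \mathrm{sinc}\!\left( \frac{\nu H}{2\pi} \right) \right] \\
&= \frac{WH}{2\pi}\, \mathrm{sinc}\!\left( \frac{\nu H}{2\pi} \right) \mathrm{sinc}\!\left( \frac{\mu W}{2\pi} \right)
\end{aligned} \tag{5.150}$$

A plot of B(μ, ν) is shown in Figure 5.33. This product of two sinc functions
confirms our discussion above; note that it is not radially symmetrical.

FIGURE 5.33

$\mathcal{F}\{b_{W,H}(x, y)\}$.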
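The closed form for the box transform can be compared against a brute-force numerical integral. This is a minimal sketch under stated assumptions: sinc(x) is taken as sin(πx)/(πx), the normalization is 1/(2π), and the box size and test frequencies are arbitrary. The function names are illustrative.

```python
import cmath
import math

def sinc(x):
    """Normalized sinc: sin(pi x)/(pi x), with sinc(0) = 1."""
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def box_transform_closed_form(mu, nu, width, height):
    """(W H / 2 pi) sinc(mu W / 2 pi) sinc(nu H / 2 pi)."""
    return (width * height / (2.0 * math.pi)
            * sinc(mu * width / (2.0 * math.pi))
            * sinc(nu * height / (2.0 * math.pi)))

def box_transform_numeric(mu, nu, width, height, steps=200):
    """Midpoint-rule estimate of (1/2 pi) times the double integral over the box."""
    dx, dy = width / steps, height / steps
    total = 0j
    for i in range(steps):
        x = -width / 2.0 + (i + 0.5) * dx
        for k in range(steps):
            y = -height / 2.0 + (k + 0.5) * dy
            total += cmath.exp(-1j * (mu * x + nu * y))
    return total * dx * dy / (2.0 * math.pi)

closed = box_transform_closed_form(1.3, -0.8, 2.0, 3.0)
numeric = box_transform_numeric(1.3, -0.8, 2.0, 3.0)
```

The imaginary part of `numeric` is essentially zero because the box is symmetric about the origin, and its real part matches `closed` to within the accuracy of the integration rule.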


5.14.2 Discrete-Time 2D Fourier Transforms

As with the 1D case, we can define discrete basis functions ψ′₂:

$$\psi'_2 = e^{j(2\pi/N)(\eta m + \xi n)} \tag{5.151}$$

with the associated normalization constant

$$\kappa_{2N} = \frac{1}{N} \tag{5.152}$$

The 2D discrete-time Fourier transform (2D DTFT) is given by the projection of
a discrete signal onto the sampled exponentials:

$$\begin{aligned}
F(\eta, \xi) &= \frac{1}{N} \sum_m \sum_n f[m, n]\, e^{-j(2\pi/N)(\eta m + \xi n)} = \kappa_{2N} \langle \psi'_2 \mid f \rangle \\
f[m, n] &= \frac{1}{N} \int_{2\pi} \int_{2\pi} F(\eta, \xi)\, e^{j(2\pi/N)(\eta m + \xi n)}\, d\eta\, d\xi = \kappa_{2N} \langle F \mid \psi'_2 \rangle
\end{aligned} \tag{5.153}$$

The properties of the continuous-time transform discussed above mostly carry
over into the discrete domain.

Example: The 2D Discrete Box

In considering the 2D DTFT, we might want to look at the transform of the discrete
2D box

$$b_{W,H}[m, n] = \begin{cases} 1 & |m| \le W/2 \text{ and } |n| \le H/2 \\ 0 & \text{otherwise} \end{cases} \tag{5.154}$$

We expect a result similar to the 1D DTFT for a box, but again it will be the product
of two such transforms. To confirm this, we again calculate the transform directly
by E16.

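$$\begin{aligned}
B[\eta, \xi] &= \langle \psi'_2 \mid b_{W,H} \rangle \\
&= \frac{1}{2\pi} \sum_m \sum_n b_{W,H}[m, n]\, e^{-j(\eta m + \xi n)} \\
&= \frac{1}{2\pi} \left[ \sum_{m=-W/2}^{W/2} e^{-jm\eta} \right] \left[ \sum_{n=-H/2}^{H/2} e^{-jn\xi} \right] \\
&= \frac{1}{2\pi}\, \frac{\sin[(\eta/2)(W+1)]}{\sin(\eta/2)}\, \frac{\sin[(\xi/2)(H+1)]}{\sin(\xi/2)}
\end{aligned} \tag{5.155}$$

Notice again the result B[η, ξ] is not radially symmetric, but it is the product of two
1D box transforms.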
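The closed form for the discrete box can be checked by summing the exponentials directly. The sketch below takes the final expression at face value, assuming a 1/(2π) constant, a plain kernel e^{−j(ηm+ξn)}, even W and H, and a box that includes the samples at ±W/2 and ±H/2. The helper names are illustrative.

```python
import cmath
import math

def dirichlet(theta, extent):
    """sin[(theta/2)(extent+1)] / sin(theta/2): sum of e^{-j m theta}, |m| <= extent/2."""
    return math.sin(0.5 * theta * (extent + 1)) / math.sin(0.5 * theta)

def box_dtft_direct(eta, xi, width, height):
    """Direct double sum over the samples of the discrete box (even width, height)."""
    half_w, half_h = width // 2, height // 2
    total = 0j
    for m in range(-half_w, half_w + 1):
        for n in range(-half_h, half_h + 1):
            total += cmath.exp(-1j * (eta * m + xi * n))
    return total / (2.0 * math.pi)

def box_dtft_closed_form(eta, xi, width, height):
    """Product of two 1D box transforms, one per axis."""
    return dirichlet(eta, width) * dirichlet(xi, height) / (2.0 * math.pi)

direct = box_dtft_direct(0.9, -0.4, 4, 6)
closed = box_dtft_closed_form(0.9, -0.4, 4, 6)
```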


5.15 Higher-Order Transforms

The discussions of the previous sections can be generalized beyond one and two
dimensions. For example, the Fourier transform in N dimensions may be defined
in terms of the spatial vector $\mathbf{x} = (x_0, x_1, x_2, \ldots, x_{N-1})$ and the frequency vector
$\mathbf{v} = (\omega_0, \omega_1, \omega_2, \ldots, \omega_{N-1})$. The continuous-time N-dimensional Fourier transform
is then given by

$$\begin{aligned}
F(\mathbf{v}) &= (2\pi)^{-N/2} \int f(\mathbf{x})\, e^{-j\mathbf{v}\cdot\mathbf{x}}\, d\mathbf{x} = \kappa^N \langle e^{-j\mathbf{v}\cdot\mathbf{x}} \mid f \rangle \\
f(\mathbf{x}) &= (2\pi)^{-N/2} \int F(\mathbf{v})\, e^{j\mathbf{v}\cdot\mathbf{x}}\, d\mathbf{v} = \kappa^N \langle F \mid e^{j\mathbf{v}\cdot\mathbf{x}} \rangle
\end{aligned} \tag{5.156}$$

Because the N-dimensional kernels are separable in N dimensions, the properties
we have discussed for 1D transforms all carry through into the N-dimensional case.

We can also define the N-dimensional discrete-time Fourier transform in a similar
way.


5.16 The Fast Fourier Transform

The Fourier transform can be expensive to implement. In 1965 the world of signal 
processing and many disciplines based upon it experienced a revolution when the 
fast Fourier transform (FFT) algorithm was published in a classic paper by Cooley 
and Tukey [104]. This paper presented an ingenious way to structure the problem 
that made its computation for large problems efficient and practical. 

Since then, the FFT has become a staple algorithm in many signal-processing 
packages, and has been implemented in special-purpose hardware from boards to 
custom chips. An introduction to the algorithm and its implementation may be 
found in an article by Bergland [40]. 

We will not discuss the implementation of the FFT in this book. To do justice to 
the technique, and address the appropriate issues of machine precision, register size, 
and so forth, would require many details that are well handled elsewhere. These 
issues are very important and require bit-level attention to ensure speed, stability, and
accuracy. The reader who wishes more information on the FFT may find additional 
pointers to the literature in the Further Reading section. 


5.17 Further Reading 

The classic text on the Fourier series and the Fourier transform is Bracewell’s book 
[61]. The book by Tolstov [438] covers much of the same material and offers a good 
place to find alternative explanations. The Fourier transform in the context of signal 
processing is discussed by Gabel and Roberts [151], as well as by Oppenheim et al. 
[326,327]. 

The fast Fourier transform developed by Cooley and Tukey is described in a 
classic paper [104]. The FFT is summarized and surveyed in Bergland’s paper [40]. 
A more up-to-date description of the FFT may be found in the book by Reid and 
Passin [357], which includes explicit source code in the C language. 

A good general book on 2D digital signal processing is Pratt [345]. An older but
still valuable text is Castleman’s book [77]. Digital signal processing in one, two, or
more dimensions is covered in detail by Dudgeon and Mersereau [130]. An excellent
introduction to the design and use of digital filters is given by Hamming [185].


5.18 Exercises 

Exercise 5.1

Prove the modulation property using duality, convolution, and the continuous-time
Fourier transform.

Exercise 5.2

Consider the signal f(t) = sin[(w + 0.5)t] / sin[0.5t], which (except for a scaling
factor) is the DFT of a box with width w. The signal blows up to infinity when the
denominator is zero but the numerator is not.

(a) For what values of t is the denominator zero?

(b) What values of w will cause the numerator to be zero at all of those values of
t?

(c) What can you say about the period of f based on the values for w you just
derived?


Exercise 5.3

Suppose you had the signal

$$f(t) = \begin{cases} 0 & t < 0 \\ 2 & 0 \le t < 1 \\ 4 & 1 \le t < 2 \\ 3 & 2 \le t < 3 \\ 0 & 3 \le t \end{cases}$$

Find the analytic Fourier transform F(ω) for this signal f(t).

(a) Derive the inverse transform of this spectrum, $f'(t) = \mathcal{F}^{-1}\{F(\omega)\}$.

(b) Compare the signals f(t) and f′(t). Are they identical? If not, summarize
their differences.

(c) Use a symbolic mathematics program to compute the Fourier transform $F_m(\omega)$
and the reconstructed signal $f_m(t)$. Make sure you determine what conventions
are used by the program (e.g., the normalization constants may not be
symmetrical).

(d) Summarize any differences between the program’s definition of Fourier
transforms and those in this chapter. If there are differences, suggest their
motivation.

(e) Compare f(t) and $f_m(t)$ and summarize any differences.

(f) Compare f′(t) and $f_m(t)$ and summarize any differences.

Exercise 5.4

Consider the N-dimensional continuous-time Fourier transform given in
Equation 5.156.

(a) Write out the equations for the 3D CTFT in explicit form (i.e., expand out
the vectors with explicit values).

(b) Write the equations for the 3D DTFT in vector form.

(c) Write out the form of a 3D convolution.

(d) Could you theoretically implement a 3D DTFT using a 1D DTFT as a
subroutine?

Exercise 5.5

(a) Write the band-pass filter of Equation 5.105 as a product of shifted low-pass
and high-pass filters given in Equations 5.103 and 5.104.

(b) Take the inverse Fourier transform of the filter expression from (a).

Exercise 5.6

Compare the coefficient definition in Equation with the Gram-Schmidt
orthogonalized coefficient in Equation SA7. What does this tell you about the Fourier
coefficients?

Exercise 5.7

Prove the duality of multiplication and convolution expressed by Equation 5.97.


Little fish have littler fish 
That feed upon and bite 'em. 
And littler fish have littler fish 
And so on, ad infinitum. 

Anonymous 



WAVELET TRANSFORMS


6.1 Introduction 

This chapter is devoted to wavelets. The wavelet transform is a projection of a signal 
onto a series of basis functions called the wavelet basis. 

Our motivation for studying wavelets is that for some signal analysis questions, 
they are more convenient than Fourier transforms. Many of the signals that we deal 
with in computer graphics are nonstationary. That is, they are not statistically the 
same everywhere. In our case, they possess a lot of high-frequency information in 
some places, and very little of it in others. This is the reason why the techniques of 
adaptive supersampling (discussed in Chapter 9) have been developed and work so 
well. In regions of an image with a lot of edges, textures, and shadows, we need 
to sample densely in order to account for the high frequencies in that region. If the 
image has a single flat color for a background, then we can sample very sparsely. 

The only analytic tool we have so far for analyzing the frequency content of a
signal is the Fourier transform, in its different varieties. The Fourier basis functions
have infinite support; they exist everywhere in the signal domain. If our signal has
finite support, then we need to get the basis functions to cancel each other out beyond
our region of interest.

The Fourier transform doesn’t allow us to look at the frequency content of just one
piece of a signal independently of what’s happening elsewhere; we always integrate
over the entire signal, and get back the frequency content over its entire duration.
If there’s a high-frequency burst somewhere in the midst of a predominantly
low-frequency signal, we will only find out that the signal contains some high-frequency
information; the Fourier transform doesn’t tell us anything about the location or
extent of that burst, or even that it was a burst at all and not a diffuse smattering of
high frequencies.

This is often undesirable. For example, consider a piece of music, particularly 
“classical” music of the Western hemisphere in the last few centuries. We know that 
over time, different pitches (or notes) are played for different durations. At a given 
time, some notes are sounding and others are not; suppose a particular note rings for 
only a second or so at two minutes into a ten-minute piece. The Fourier transform of 
the complete piece will reveal that there is some component of that note in the piece, 
but we have no idea if it occurred once, many times, or was playing continuously 
throughout. 

The desire to describe the frequency content in a short segment of the signal led 
to the development of a variety of techniques aimed at providing local descriptions 
of signals, as surveyed by Meyer [302]. 

We will see that the short-time Fourier transform provides a set of tools for
isolating pieces of a signal and examining ranges of frequencies. But unfortunately,
these tools are rather blunt; they use finite basis functions that are built out of infinite
bases, and that tends to introduce some artifacts into the analysis.

Wavelets were developed to address just these problems. The wavelet transform 
takes an input signal and projects it onto a new set of basis functions, which are 
called wavelets . The remarkable thing is that the wavelets can have compact support. 
Therefore to match a finite signal, we can put together a finite number of finite 
wavelets. The wavelets are then combined with the appropriate parameters to 
reconstruct the input signal. 

The sorts of questions we posed above, dealing with local bursts of information 
and localized frequency ranges, are easily and naturally addressed by wavelets. They 
form a hierarchy of signals from the input. The analysis property of wavelets allows 
us to look at signals from different resolutions , as well as different frequency ranges. 
We use the word resolution to refer to the number of samples in a signal; this also 
describes the level of detail in the signal. 

Wavelets form a 2D family of functions that are derived from an original function,
v, called the scaling function. From the scaling function we create one mother
wavelet, and all the other wavelets spring from scaled, dilated, and shifted versions
of that mother wavelet. Wavelets can form an orthonormal basis and can be
implemented quickly using a fast wavelet transform analogous to the fast Fourier
transform.

One principal reason for studying wavelets is that they lead to a natural technique
for building a compressed representation of a signal, where the smooth parts are
represented by just a little bit of information, and the more complex (high-frequency)
parts are given only as much additional bandwidth as they require.

As an example, suppose that we want to compute the discrete inner product of
two n-element vectors v₁ and v₂. The straightforward approach requires n multiplies
and n − 1 additions. By analogy with the Fourier transform, let’s write the wavelet
transform of a vector as W(v). Because the transform is linear,

$$W(\mathbf{v}_1 \cdot \mathbf{v}_2) = W(\mathbf{v}_1) \cdot W(\mathbf{v}_2) \tag{6.1}$$

Suppose that the wavelet-transformed vectors only have k < n nonzero elements.
Then we only need k multiplies and k − 1 additions to compute the same result
(plus the cost of inverting the result). In some special cases, we can actually get
these kinds of savings. More generally, the wavelet-transformed vector has many
elements that are nearly zero. By ignoring these elements (that is, setting them to
zero), we get the speedup described above, though we have now introduced some
error. One of the beauties of wavelets is that they are structured in such a way that
this error can be made quite small. These savings become substantial when we move
from vector multiplication to matrix multiplication, where the number of multiplies
grows proportionally to n², where n is the number of elements on one side of the
matrix.
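To make the savings concrete, here is a small sketch using one particular choice of transform: an orthonormal Haar transform built from pairwise sums and differences scaled by 1/√2. That normalization is an assumption of this example, chosen because an orthonormal transform preserves inner products, so the dot product of the transformed vectors equals the dot product of the originals. The vectors are arbitrary piecewise-constant test data.

```python
import math

def haar_transform(values):
    """Orthonormal Haar transform of a list whose length is a power of two."""
    out = list(values)
    length = len(out)
    root2 = math.sqrt(2.0)
    while length > 1:
        half = length // 2
        averages = [(out[2 * i] + out[2 * i + 1]) / root2 for i in range(half)]
        details = [(out[2 * i] - out[2 * i + 1]) / root2 for i in range(half)]
        out[:length] = averages + details   # coarse part first, then details
        length = half                       # recurse on the coarse part only
    return out

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

v1 = [4.0, 4.0, 4.0, 4.0, 1.0, 1.0, 1.0, 1.0]
v2 = [2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 6.0, 6.0]
w1, w2 = haar_transform(v1), haar_transform(v2)

plain = dot(v1, v2)                          # 8 multiplies
sparse = dot(w1, w2)                         # same value, mostly zero terms
nonzero_w1 = sum(1 for c in w1 if abs(c) > 1e-12)
nonzero_w2 = sum(1 for c in w2 if abs(c) > 1e-12)
```

Because the test vectors are smooth except for one step each, `w1` has only two nonzero coefficients and `w2` has three, so only the positions where both are nonzero contribute to the transformed inner product.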

Wavelets are proving to have great value in computer graphics, where we often 
encounter signals that are mostly smooth, but contain important regions of high- 
frequency content. For example, consider an illumination function describing light 
falling on a point in space. If there is a strong light source nearby that is partly 
blocked by an opaque object, then the illumination seen from that point will fall 
from bright to dark as we sweep over the boundary from illumination to shadow. On 
both sides of this shadow edge the illumination is fairly smooth. So we can represent 
this illumination signal by just a little bit of information in the smooth regions, and 
concentrate more resources on getting a good description of the discontinuity. This 
has the same sort of appeal as adaptive sampling, where we put our resources only 
where they’re needed. This leads to more efficient rendering algorithms, because we 
can represent complicated illumination and shading functions with efficient wavelet 
representations that allow us to save on work where the signals are smooth. We will 
see examples of such algorithms in Unit III. 

The field of wavelets is new and fast-growing. This chapter summarizes the basic 
principles behind wavelets by studying one example family of wavelets called the 
Haar basis in detail. The Haar basis is particularly easy to work with because the 
basis functions are their own duals (as discussed in Section 5.2.4). In general, most 
wavelets are not orthogonal and thus not their own duals. 

This chapter is not meant to be a thorough development of the theory of wavelets
or the wavelet transform. The intention here is to develop intuition and a general
understanding of the principles, rather than fluency with the mechanics. Therefore
we will approach the same material in a number of different ways, showing the
general outlines behind several interpretations of the theory. We will try to demonstrate
the general ideas through discussion and intuition rather than formality and rigor.


6.2 Short-Time Fourier Transform 

The Fourier transform is a powerful analysis tool for computing and interpreting the 
frequency representation of a signal. But the transform is defined over all time (or 
for an image, the entire image plane, not just the finite region where we may have 
pixel values). We repeat the definition of the continuous-time Fourier transform 
from Equation 5.55: 

$$\begin{aligned}
X(\omega) &= \kappa \langle \psi \mid x \rangle \\
&= \kappa \int x(t)\, e^{-j\omega t}\, dt
\end{aligned} \tag{6.2}$$

Recall our convention that the limits on an integral without indices, such as this 
one, are infinity in both directions. So this formulation tells us about the frequency 
content of the entire signal. 

The complex exponentials used as the basis functions for the Fourier transform 
present a problem if we’re only interested in part of the signal. These basis functions 
have infinite extent; if a signal has a finite support in the time domain (i.e., it’s zero 
outside of some region), then we will require many terms in the transform to get the 
functions to cancel out beyond the finite interval. The coefficients on these terms are 
often very small, and they’re all important. If we decide to save storage and effort 
by truncating the Fourier series and ignoring high-order terms, we affect the quality 
of our signal description everywhere , not just in the high-frequency regions. 

There are many signals that have different frequency content in different places. 
For example, suppose you were about to embark on a 1,000-mile journey by
automobile, and someone mentioned to you that somewhere along the road there are
some sharp turns and you should go slowly. You could be very conservative and
drive slowly the whole way, but this information would be much more useful if you
knew where the sharp turns were located; then you could slow down only where
necessary.

We can think of an image in the same way; the high-frequency content of a 
smooth background is rather small, but if there’s an object with a complex texture in 
the foreground, we’d expect the high-frequency content in that region to be larger. 

It’s natural to think about addressing these questions by isolating sections of the
signal and taking independent Fourier transforms of those pieces. This approach is
called the short-time Fourier transform (or STFT).

To take the STFT of a signal, we multiply the signal with a window function
that is typically nonzero only in the region in which we’re interested. This window
function (also called the analysis filter or analysis window) is central to the STFT
method. We want it to be zero outside the region of interest and unity within. If
we use a simple box function, effectively just clamping the signal to zero outside the
region of interest, then we will almost certainly introduce high frequencies at the
edges of the window, where the signal will need to suddenly jump to zero; this is
illustrated in Figure 6.2.

The art to getting a meaningful STFT is finding a window function that
simultaneously isolates just the part of the signal we’re interested in (leaving that signal
untouched), eliminates the rest of the signal, and introduces no artifacts (this usually
means a smooth transition at the edges). These goals are mutually antagonistic. A
common compromise is to use a Gaussian for the window function; in this case the
STFT is called a Gabor transform. Figure 6.2 shows the result of using this window
shape.
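The trade-off between the two window shapes can be measured on a toy signal. The sketch below is illustrative only: it windows a pure cosine (64 samples, tone at DFT bin 8, both arbitrary choices) with a box and with a Gaussian of hypothetical width σ = 6 samples, then reports what fraction of the spectral energy lands in bins far from the tone.

```python
import cmath
import math

N = 64          # number of samples (arbitrary)
TONE_BIN = 8    # the cosine completes 8 cycles over the N samples

def dft_energy(samples):
    """Squared magnitude of each (unnormalized) DFT bin."""
    n = len(samples)
    return [abs(sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) ** 2
            for k in range(n)]

def leakage_fraction(window):
    """Fraction of energy in bins 16..48, all at least 8 bins from the tone."""
    windowed = [window[t] * math.cos(2.0 * math.pi * TONE_BIN * t / N)
                for t in range(N)]
    energy = dft_energy(windowed)
    return sum(energy[16:49]) / sum(energy)

box = [1.0 if 16 <= t < 48 else 0.0 for t in range(N)]
gaussian = [math.exp(-0.5 * ((t - 32) / 6.0) ** 2) for t in range(N)]

box_leak = leakage_fraction(box)
gaussian_leak = leakage_fraction(gaussian)
```

The box window leaves a small but clearly visible share of the energy far from the tone, caused by its abrupt edges, while the Gaussian window's share is many orders of magnitude smaller. The price is that the Gaussian attenuates the signal inside the region of interest rather than leaving it untouched.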

Let’s look at the implications of windowing a signal. Suppose the windowed
signal is centered at time $t_g$. We can then write the windowed signal simply as the
product of the signal x and the window g; the Fourier transform of this product will
be written $X_g$:

$$X_g(t_g, \omega) = \kappa \int x(t)\, g(t - t_g)\, e^{-j\omega t}\, dt \tag{6.3}$$

We can see that this windowing operation introduces high frequencies into the result.
Recall that multiplying with a box in the time domain is equivalent to convolving
with a sinc in the frequency domain; this will diffuse the spectrum because of the
infinite support of the sinc function.

We can interpret Equation 6.3 as the signal x projected onto new basis functions 
that are windowed exponentials: 

$$X_g(t_g, \omega) = \kappa \int x(t) \left[ g(t - t_g)\, e^{-j\omega t} \right] dt \tag{6.4}$$

So these new functions, $g(t - t_g)\, e^{-j\omega t}$, form the new basis for the transform.

Let’s pause for a moment and see how this translates to discrete signals. Recall
the definition for the discrete Fourier transform from Equation 5.128:

$$\begin{aligned}
X(\Omega) &= \kappa \langle \psi' \mid x \rangle \\
&= \kappa \sum_n x[n]\, e^{-j\Omega n}
\end{aligned} \tag{6.5}$$

We can apply a filter g centered at sample $n_g$:

$$X_g(n_g, \Omega) = \kappa \sum_n x[n]\, g[n - n_g]\, e^{-j\Omega n} \tag{6.6}$$

transform. (c) The resulting windowed signal and its Fourier transform.

FIGURE 6.3

Two views of $X_g(n_g, \Omega_0)$. (a) Modulating x and then convolving with g. (b) Convolving x with a
modulated window and then modulating the result.


At a particular frequency Ω₀, we get

$$\begin{aligned}
X_g(n_g, \Omega_0) &= \kappa \sum_n \left( x[n]\, e^{-j\Omega_0 n} \right) g[n - n_g] \\
&= \kappa \left( x[n]\, e^{-j\Omega_0 n} \right) * g[n]
\end{aligned} \tag{6.7}$$

where we’ve written the filtering operation as a convolution. We can think of this
form of $X_g$ as taking a signal, modulating it up to Ω₀, and then convolving it with
the window filter g; this interpretation is shown in Figure 6.3(a).

Alternatively, we can convolve the signal with the filter first, as in Equation 6.4,
writing

$$X_g(n_g, \Omega_0) = \kappa\, e^{-j\Omega_0 n} \left( x[n] * g[n]\, e^{j\Omega_0 n} \right) \tag{6.8}$$

This interpretation is shown in Figure 6.3(b). These two views lead us to quite
different strategies for implementations.
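That the two views give the same number can be confirmed on a short signal. In this sketch the constant κ is dropped from both sides, the window is a hypothetical symmetric triangle (so that g[n − n_g] and g[n_g − n] coincide and the filtering really is a convolution), and the signal values are arbitrary. The function names are illustrative.

```python
import cmath

def stft_modulate_then_filter(x, g, n_g, omega0):
    """View (a): modulate x by e^{-j omega0 n}, then convolve with the window g."""
    return sum(x[n] * cmath.exp(-1j * omega0 * n) * g(n_g - n)
               for n in range(len(x)))

def stft_filter_then_modulate(x, g, n_g, omega0):
    """View (b): convolve x with g[n] e^{j omega0 n}, then modulate the result."""
    convolved = sum(x[n] * g(n_g - n) * cmath.exp(1j * omega0 * (n_g - n))
                    for n in range(len(x)))
    return cmath.exp(-1j * omega0 * n_g) * convolved

def triangle_window(n):
    """Hypothetical symmetric window of half-width 3 samples."""
    return max(0.0, 1.0 - abs(n) / 3.0)

x = [1.0, 3.0, -2.0, 0.5, 4.0, 1.0, -1.0, 2.0]
view_a = stft_modulate_then_filter(x, triangle_window, 4, 1.1)
view_b = stft_filter_then_modulate(x, triangle_window, 4, 1.1)
```

The agreement follows from writing e^{−jΩ₀n} as e^{−jΩ₀n_g} e^{jΩ₀(n_g − n)} and pulling the first factor out of the sum.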

Returning to the continuous domain, consider that we will typically move the
filter window in equal steps across the signal, and likewise analyze equal increments
in frequency. Taking $t_g = n t_0$, where $t_0$ is the filter step size, and $\omega = m\omega_0$, where $\omega_0$
is the frequency step size, then for all m, n ∈ ℤ,

$$\begin{aligned}
X_g(n, m) &= \kappa \int x(t)\, g(t - n t_0)\, e^{-jm\omega_0 t}\, dt \\
&= \kappa \int x(t) \left[ g(t - n t_0)\, e^{-jm\omega_0 t} \right] dt
\end{aligned} \tag{6.9}$$

FIGURE 6.4

The two-parameter family of basis functions, $g(t - n t_0)\, e^{-jm\omega_0 t}$, dependent on $t_0$ and $\omega_0$ and
parameterized by n and m. These form the basis for the STFT.

The last line of Equation 6.9 is an important link to wavelets. It shows that we
can consider $X_g(n, m)$ to be based on a two-parameter family of basis functions,
$g(t - n t_0)\, e^{-jm\omega_0 t}$, dependent on $t_0$ and $\omega_0$ and parameterized by n and m. This
family of functions is shown in Figure 6.4. So the STFT projects the input function
onto these basis functions.

We say that the STFT basis functions have “compressed” support, because most
of their energy is near the center although they never go completely to zero (since
the windowing Gaussian never drops completely to zero). They’re something of a
compromise between the convenient (but infinite) exponentials and the desire for
finite support.


6.3 Scale and Resolution

The Fourier transform is principally concerned with the frequency and phase
information in a signal. The wavelet transform is similar in some ways and allows us to
look at the signal from different scales and different resolutions.

These terms already have meanings in signal processing and they mean something 
different here. It will be helpful to have an idea of what these terms mean in this 
context before we go more closely into wavelets. 

We will consider scale first. Imagine a map of a city, where 1 centimeter is equal 
to 1 kilometer. If we photoreduce the map to half its original size, then 1 centimeter 
will now correspond to 2 kilometers; the scale has doubled. Similarly, if we enlarge 
the original map by a factor of two, then we say the scale has halved. So the scale 
describes how much of the original signal corresponds to one unit (which is often the 
value associated with one equally spaced sample) of a representation of that signal. 

When applied to wavelets, resolution refers to the amount of information in a 
signal; this is related to its frequency content. If we low-pass filter a signal, we don’t 
change its scale but we alter its resolution. 

In terms of signals, suppose we have a four-element discrete signal, $s_0 = [0,0,2,2]$.
We can create a new version of this signal by averaging together successive pairs of
values, creating $s_1 = [0,2]$. Like the map that has been shrunk down, each element
of $s_1$ now covers twice the territory in the original signal as each element in $s_0$, so
the scale has doubled. If we repeat this operation, we get $s_2 = [1]$; the scale of
$s_2$ is four times the scale of $s_0$ and the resolution has changed. Suppose we return
to the original signal $s_0$, and now low-pass filter it with a simple filter, creating
$s_f = [0.25,0.75,1.25,1.75]$. The scale of $s_f$ is the same as $s_0$, but the resolution of
the signal has changed.
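
The scale-doubling step can be tried directly. The following is a minimal Python sketch; the helper name `halve` is an assumption made for illustration and does not come from the text. It averages successive pairs, so each output element covers twice as much of the original signal.

```python
def halve(signal):
    """Average successive pairs; each output element covers twice the territory."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]

s0 = [0, 0, 2, 2]
s1 = halve(s0)  # scale has doubled relative to s0
s2 = halve(s1)  # scale is four times that of s0
print(s1, s2)
```

Note that the sketch covers only the change of scale; the low-pass filter that produces the same-scale signal in the text is not specified there, so it is not reproduced here.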

These two ideas crop up all the time in wavelets, because the basic idea is to create 
a number of signals from the original, each at a different scale. Wavelets allow us 
to perform the equivalent of photoreducing a map, by building smaller signals that 
contain the same general information as the larger ones. 


6.4 The Dilation Equation and the Haar Transform

We will explore wavelets by discussing a single example. The canonical example for 
a wavelet basis, and the one we will use, is the Haar wavelet , named for a set of 
functions introduced by A. Haar in 1910 [302]. 

The heart of the wavelet method is the dilation equation. The dilation equation
tells us how to create a function from dilated, scaled, and shifted versions of some
original, canonical function. Note that for a given function $v(x)$, we can dilate (i.e.,
compress or widen) that function by some amount $a$ by computing $v(ax)$, we can
scale the function by $s$ by computing $sv(x)$, and we can shift the function by an
amount $b$ by computing $v(x + b)$. Combining these, we would have $sv(ax + b)$. For
some functions $v$, we can combine dilated, scaled, and shifted versions of themselves
to duplicate the original. For a dilation factor of 2, the dilation equation reads

$$v(x) = \sum_{k=0}^{N} c_k\, v(2x - k) \tag{6.10}$$

The coefficients $c_k$ are usually real for a real-valued $v$. We often apply the dilation
equation recursively; to keep the generations straight, we can index the functions:

$$v_j(x) = \sum_{k=0}^{N} c_k\, v_{j-1}(2x - k) \tag{6.11}$$


Before looking at how the dilation equation applies to functions, we pause to note
that (with a little generalization) it also describes a class of objects called 2D reptiles
[174]. A reptile is a 2D figure that can be assembled from several smaller copies
of itself. A familiar reptile is the square: a square may be built by combining four
smaller, shifted squares (in this case we leave the top and right sides, and right-side
corners, open). Begin with a square function $S(x, y)$ defined by

$$S(x, y, l_x, l_y, s) = \begin{cases} 1 & l_x \le x < l_x + s \text{ and } l_y \le y < l_y + s \\ 0 & \text{otherwise} \end{cases} \tag{6.12}$$

We can assemble $S$ from four smaller copies:

$$\begin{aligned}
S(x, y, l_x, l_y, s) ={} & S(x, y, l_x, l_y, s/2) \\
{}+{} & S(x, y, l_x + s/2, l_y, s/2) \\
{}+{} & S(x, y, l_x, l_y + s/2, s/2) \\
{}+{} & S(x, y, l_x + s/2, l_y + s/2, s/2)
\end{aligned} \tag{6.13}$$

as illustrated in Figure 6.5(a).

In this example, we took the original square $S$ and produced four new copies,
each of which was dilated and shifted before being combined to make the original
square. Note that the result is not a set of functions similar to the input; these scaled
copies, added together, are equal to the input. If we generalize the scaling coefficients
$c_k$ in 2D to more general transformations that include rotation and reflection, then
we can make the other three examples in Figure 6.5.

FIGURE 6.5

Reptiles are 2D figures that can be combined to make copies of themselves. (a) Four squares make
a square. (b) Equilateral triangle. (c) The L-box. (d) The sphinx.


Let’s now return to functions. One common example of a function that satisfies
the dilation equation is a simple box $y(t)$ of unit height over the interval $[0,1]$, illustrated
in Figure 6.6(a). We will label this $y_0(t)$ to distinguish it from the descendants
we will generate below.

$$y_0(t) = \begin{cases} 1 & 0 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \tag{6.14}$$

In the dilation equation we will set all the coefficients to 0 except $c_0 = c_1 = 1$. Then
we find

$$y_1(t) = y_0(2t) + y_0(2t - 1) \tag{6.15}$$

So $y_1(t)$ is made up of two half-width boxes, as shown in Figure 6.6(b). Taking
another step, we find that $y_2$ is made up of two copies of $y_1$, which is equivalent to
four quarter-width copies of $y_0$:

$$\begin{aligned}
y_2(t) &= y_1(2t) + y_1(2t - 1) \\
&= y_0(4t) + y_0(4t - 1) + y_0(4t - 2) + y_0(4t - 3)
\end{aligned} \tag{6.16}$$

This is shown in Figure 6.6(c) and (d). We can continue the recursion indefinitely. 
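
As a quick numerical check of the recursion, here is a small Python sketch. The function names (`box`, `y1`, `y2`, `y2_expanded`) are illustrative choices, not notation from the text. It confirms on a grid of sample points that each generation reproduces the unit box, and that the second generation equals four quarter-width boxes $y_0(4t - k)$, $k = 0, \ldots, 3$.

```python
def box(t):
    """Unit box y0: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0 <= t < 1 else 0.0

def y1(t):
    """First generation: two half-width boxes (c0 = c1 = 1)."""
    return box(2 * t) + box(2 * t - 1)

def y2(t):
    """Second generation: two copies of y1."""
    return y1(2 * t) + y1(2 * t - 1)

def y2_expanded(t):
    """Second generation written as four quarter-width boxes."""
    return sum(box(4 * t - k) for k in range(4))

# Sample points that include values outside the unit interval.
samples = [i / 16 for i in range(-8, 24)]
same_as_box = all(y1(t) == box(t) and y2(t) == box(t) for t in samples)
expansions_agree = all(y2(t) == y2_expanded(t) for t in samples)
print(same_as_box, expansions_agree)
```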

The dilation equation is the key to wavelets. Much work has been expended on
finding functions (and their coefficients) that satisfy it; we will stick with the box,
and $c_0 = c_1 = 1$ for now.

The dilation equation tells us how to find a function $v$ that can be built from a
sum of copies of itself (after dilation, translation, and scaling). A function satisfying
this criterion is called a scaling function. From each scaling function $v$ we can then
create a corresponding set of wavelets, which are the basis functions for the wavelet
transform. The Greek letter $\psi$ is often used in the wavelet literature to denote the
wavelet basis functions, just as it’s used in some signal-processing literature to denote
the Fourier basis functions. The wavelet literature also sometimes uses the letter $w$
to represent a wavelet; we will adopt this notation to avoid confusion. The Fourier
transform of a wavelet will thus use the symbol $W$.

The recipe for constructing a wavelet from the scaling function looks very similar
to the dilation equation. In fact, it uses the same coefficients, only in opposite order
and with alternating signs. To construct the first wavelet (which we will call $w^0$)
from the basis function, we apply

$$w^0(t) = \sum_{k \in Z} (-1)^k c_{1-k}\, v(2t - k) \tag{6.17}$$

So in our example, when $v(t)$ is the unit box and all $c_k = 0$ except $c_0 = c_1 = 1$, we
find

$$w^0(t) = v(2t) - v(2t - 1) \tag{6.18}$$

which is shown in Figure 6.7(a). We will see in a moment how to generate higher-order
wavelets directly from $w^0$. To set the stage, consider what happens when we
apply the dilation equation to $w^0$:

$$w^1(t) = \sum_{k=0}^{N} c_k\, w^0(2t - k) \tag{6.19}$$

This is actually a number of individual wavelets $w^0$, shifted, scaled, dilated, and
combined to form the complete set $w^1(t)$.


FIGURE 6.6

The dilation equation applied to the box. (a) The box function $y_0$. (b) First generation: two copies
of $y_0$. (c) Second generation: two copies of $y_1$. (d) Another look at the second generation: four
copies of $y_0$.


FIGURE 6.7

Building up the Haar wavelets. (a) $w^0(t) = v(2t) - v(2t - 1)$. (b) $w^1(t) = w^0(2t) + w^0(2t - 1)$.
(c) $w^2(t) = w^1(2t) + w^1(2t - 1)$. (d) $w^2(t) = w^0(4t) + w^0(4t - 1) + w^0(4t - 2) + w^0(4t - 3)$.

Continuing the recursion, $w^1$ and $w^2$ are given by

$$w^1(t) = w^0(2t) + w^0(2t - 1) \tag{6.20}$$

$$\begin{aligned}
w^2(t) &= w^1(2t) + w^1(2t - 1) \\
&= w^0(4t) + w^0(4t - 1) + w^0(4t - 2) + w^0(4t - 3)
\end{aligned} \tag{6.21}$$

and are shown in Figure 6.7(b)-(d). The individual functions that we have been
combining are shifted, scaled, and dilated versions of $w^0$; these particular functions
are called the Haar wavelets.

Another set of wavelets is built from what are colloquially called the hat functions.
These are shown in Figure 6.8. Once again, an original function, here $\Lambda(t)$, is dilated
and shifted. The original function is given by

$$\Lambda(t) = \begin{cases} 2t & 0 \le t < 1/2 \\ 2(1 - t) & 1/2 \le t < 1 \\ 0 & \text{otherwise} \end{cases} \tag{6.22}$$

The shifted and dilated functions are referred to as $\Lambda_n(t)$, where

$$\Lambda_n(t) = \Lambda(2^j t - k), \qquad n = 2^j + k, \quad j \ge 0, \quad 0 \le k < 2^j \tag{6.23}$$

The nonzero support of $\Lambda_n(t)$ is $I_n = [k2^{-j}, (k + 1)2^{-j}]$.

The hat functions (together with the constant function 1 and the linear function
$\Lambda_0 = x$) form the Schauder basis [302]. This set of bases is able to capture linear
functions more efficiently than the Haar bases because it has two vanishing moments
and no discontinuities in the function, discussed in more detail below.

Wavelets may be written individually as $w^{a,b}$, which are a two-parameter family
of functions based on the original wavelet $w^0$ (this is sometimes called the generating
or mother wavelet). The parameters $a$ and $b$ give rise to a wavelet $w^{a,b}$ by the
following formula:

$$w^{a,b}(t) = |a|^{-1/2}\, w^0\!\left(\frac{t - b}{a}\right) \tag{6.24}$$

To see how the wavelets act as a basis, let’s begin by writing down the wavelet
transform, the wavelet analog to the Fourier transform:

$$X(a,b) = |a|^{-1/2} \int x(t)\, w^0\!\left(\frac{t - b}{a}\right) dt \tag{6.25}$$

Here $a$ is the dilation parameter and $b$ is the translation parameter.

FIGURE 6.8

The hat basis functions $\Lambda_n(t)$.

By analogy to the STFT, we can parameterize $a$ and $b$ in terms of constants $a_0$ and
$b_0$. The mechanism is slightly different, however. Recall that when we scale a map
or image, we usually want to do so in equal jumps; that is, we want each successive
magnification to be an equal percentage increase over the previous magnification.
This is a geometric rather than arithmetic progression in the magnification, so the
dilation parameter $a$ goes up as $a_0^m$. To ensure that the adjacent translations still abut
at each scale, the translation parameter needs to be tied to the dilation parameter;
when the wavelet shrinks, we need to move copies by smaller amounts. We therefore
set $b = nb_0a_0^m$. Then for $a_0 > 1$, $b_0 > 0$, and $m, n \in Z$, we find:

$$X(a_0, b_0, m, n) = a_0^{-m/2} \int x(t)\, w^0(a_0^{-m} t - nb_0)\, dt \tag{6.26}$$

As the value of $a$ decreases, the wavelet $w^{a,0}(t) = |a|^{-1/2} w(t/a)$ compresses
together, so it is capable of representing a signal that changes quickly. As $a$ increases,
the wavelet widens, reducing its ability to capture quickly changing signals and
covering more of the domain. The translation parameter $b$ controls the center of
the wavelet. The way these parameters influence the wavelet is shown in Figure 6.9;
their Fourier transforms appear in Figure 6.10.

It’s easy to see that the Haar wavelets are orthogonal. Because the wavelets
have compact support, we can see that any two wavelets at the same scale, $w^{a,b}$
and $w^{a,b+k}$ for any $k \ne 0$, are disjoint, as shown in Figure 6.11(a), and therefore
orthogonal. Wavelets at different scales are also orthogonal: for any two wavelets
$w^{a,b}$ and $w^{a+k,b}$ for any $k \ne 0$, if $k > 0$, then $w^{a+k,b}$ will sit completely within a
constant-valued piece of $w^{a,b}$, as shown in Figure 6.11(b).
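
The orthogonality argument can be checked numerically. The sketch below is an illustration under stated assumptions: it uses the level-and-position indexing on the unit interval, $w^{a,b}(t) = w^0(2^a t - b)$, without normalization, and the names `haar`, `w`, and `inner` are invented for this example. Because these wavelets are constant on dyadic cells, a midpoint sum over 64 cells evaluates the inner products exactly.

```python
def haar(t):
    """Haar mother wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1:
        return -1.0
    return 0.0

def w(a, b):
    """Unnormalized wavelet at level a and position b (assumed indexing)."""
    return lambda t: haar(2 ** a * t - b)

def inner(f, g, cells=64):
    """Midpoint sum over dyadic cells of [0, 1); exact for these step functions."""
    return sum(f((i + 0.5) / cells) * g((i + 0.5) / cells) for i in range(cells)) / cells

same_scale = inner(w(1, 0), w(1, 1))       # disjoint supports
different_scale = inner(w(0, 0), w(2, 1))  # narrow wavelet inside a constant piece
self_product = inner(w(0, 0), w(0, 0))     # nonzero: a wavelet against itself
print(same_scale, different_scale, self_product)
```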

The Haar basis is most efficient at representing signals that are piecewise-constant.
By efficient in this context we mean that many of the wavelet coefficients in this basis
will be zero for such a signal. To see this, consider first of all that the inner product
$\langle w^{0,0}(t) \mid a \rangle$ of the Haar mother wavelet and a constant function $a$ is zero.

This is easy to see from the shape of the Haar wavelet; it is made up of two equal-sized
boxes, one above and one below the axis. So if we have a constant signal,
the DC coefficient $b^v$ (discussed below) will capture the amplitude of the signal, and
every one of the Haar wavelets $w$ will have a coefficient of zero.

Now suppose we have a signal $f$ that is made up of small constant segments.
For simplicity, we will assume that the segments are constant between dyadic points;
that is, multiples of some power of two. For example, a segment might extend over
$[a2^{-k}, (a + 1)2^{-k}]$ for some integer $a$. As the Haar basis functions are dilated (and
thus cover less of the signal), eventually there will come a point where one wavelet
will exactly cover this interval. Now we just have a magnified version of what we
saw earlier; the projection of this constant segment on this shrunk-down wavelet will
be zero. By the same argument, all the smaller wavelets in this interval will also have
a coefficient of zero. Because the coefficients drop to zero after some point, we can
ignore them: this leads to less storage and faster transformations, hence improved
efficiency.

FIGURE 6.9

The two-parameter family of wavelet basis functions, $|a|^{-1/2} w((t - b)/a)$ for $a = a_0^m$ and
$b = nb_0a_0^m$ for a range of values of $n$ and $m$. The particular wavelet used in this figure is
the twice-differentiated Gaussian, $w(t) = (1 - t^2)e^{-t^2/2}$.

We generalize this notion by computing the moments of the wavelet functions
with respect to the monomials. We say that moment $i$ of a function is the projection
of that function against the monomial $x^i$:

$$M_i = \langle f \mid x^i \rangle = \int_a^b f(x)\, x^i\, dx \tag{6.27}$$


If all of the moments $i = 0, 1, \ldots, M - 1$ are zero for some function, we say that
function has $M$ vanishing moments. If a function has $M$ vanishing moments, then
when the wavelets shrink down to the point where they cover a piece of signal that is
a polynomial of degree $M - 1$ or less, the coefficient on that wavelet and all smaller
wavelets will be zero.
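
To make the definition concrete, the following Python sketch computes the first moments of the Haar mother wavelet exactly, using the closed-form integrals of $x^i$ over $[0, 1/2)$ and $[1/2, 1)$. The function name `haar_moment` is an assumption made for this example. Moment 0 vanishes and moment 1 does not, which is consistent with the Haar basis being efficient only for piecewise-constant signals.

```python
from fractions import Fraction

def haar_moment(i):
    """Exact moment i of the Haar mother wavelet on [0, 1).

    The wavelet is +1 on [0, 1/2) and -1 on [1/2, 1), so the moment is
    the integral of x**i over the first half minus that over the second.
    """
    half = Fraction(1, 2)
    first_half = half ** (i + 1) / (i + 1)
    second_half = (1 - half ** (i + 1)) / (i + 1)
    return first_half - second_half

print(haar_moment(0), haar_moment(1))
```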

We will see how to use Haar wavelets to represent signals in the next section. 




FIGURE 6.10

The Fourier transforms of the functions in Figure 6.9.

FIGURE 6.11

(a) Two wavelets at the same scale are orthogonal. (b) Two wavelets at different scales are orthogonal.


6.5 Decomposition and Reconstruction

A wavelet transformation is built from the scaling function $v(t)$, the wavelet coefficients
$b^{a,b}$, and the wavelets $w^{a,b}(t)$.

To see how this works, consider the simple function shown in Figure 6.12. This
function $x(t)$ is piecewise-constant on 1/4-sized pieces of the unit interval. The
scaling function $v(t)$ and the first two generations of wavelets, $w^{0,0}$ to $w^{1,1}$, are
shown in Figure 6.12. Suppose we have a four-valued column input vector $x(t) =
[5, -3, 2, 8]^t$; then we can write the wavelet transform of $x$ as

$$x(t) = 3\,v(t) + (-2)\,w^{0,0}(t) + 4\,w^{1,0}(t) + (-3)\,w^{1,1}(t) \tag{6.28}$$


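If we expand the wavelet basis functions, we can see how they contribute to the
total:

$$\underbrace{\begin{bmatrix} 3 \\ 3 \\ 3 \\ 3 \end{bmatrix}}_{3\,v(t)}
+ \underbrace{\begin{bmatrix} -2 \\ -2 \\ 2 \\ 2 \end{bmatrix}}_{-2\,w^{0,0}(t)}
+ \underbrace{\begin{bmatrix} 4 \\ -4 \\ 0 \\ 0 \end{bmatrix}}_{4\,w^{1,0}(t)}
+ \underbrace{\begin{bmatrix} 0 \\ 0 \\ -3 \\ 3 \end{bmatrix}}_{-3\,w^{1,1}(t)}
= \underbrace{\begin{bmatrix} 5 \\ -3 \\ 2 \\ 8 \end{bmatrix}}_{x(t)} \tag{6.29}$$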


This is shown graphically in Figure 6.12(b).

The numbers $(3, -2, 4, -3)$ in Equation 6.28 are the coefficients on the wavelet
functions that match the input function; they are the wavelet coefficients and are
often denoted as $b^{a,b}$. The coefficient on $v$ is written $b^v$.

The indices $a$ and $b$ on the coefficients refer to the level of the wavelet and its
position. The first index, $a$, tells us how much the wavelet is dilated. Higher values
of $a$ indicate wavelets with narrower bases of support, and thus are able to capture
relatively quickly changing information. The different values of $b$ tell us where in the
signal this wavelet is located. Since each time we increase the resolution we double
the number of wavelets, in general for any value of $a$ the value of $b$ will run from 0
to $2^a - 1$. For simplicity, we will sometimes write $w^0(t)$ as $w^{0,0}(t)$.
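
The expansion of the example signal can be verified with a few lines of Python. This is only an illustrative sketch: the variable names are assumptions, and the sampled basis vectors are read off from the columns of Equation 6.29 (each column divided by its coefficient).

```python
# Basis functions sampled on the four quarter-intervals of the unit interval.
v   = [1, 1, 1, 1]     # scaling function
w00 = [1, 1, -1, -1]   # widest wavelet
w10 = [1, -1, 0, 0]    # half-width wavelet, left position
w11 = [0, 0, 1, -1]    # half-width wavelet, right position

coefficients = [3, -2, 4, -3]  # (b^v, b^{0,0}, b^{1,0}, b^{1,1})
basis = [v, w00, w10, w11]

# Weighted sum of the basis vectors, sample by sample.
x = [sum(c * f[i] for c, f in zip(coefficients, basis)) for i in range(4)]
print(x)
```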

The wavelet transform is not expensive to compute. It has a lot in common 
with the fast Fourier transform and may be very efficiently programmed; details 
may be found in Strang [423]. The principles behind the wavelet transform are 
straightforward and easy to follow for a simple wavelet. 

The basic idea is to low-pass filter the original signal to get a “smoothed” version, 
and then find the difference between the smoothed signal and the original. Then we 
smooth again, find the difference between the once- and twice-smoothed versions, 
and so on until the signal has been smoothed into a single number. All of our 
examples in this section will take place in the discrete domain using the signal
$x = [5, -3, 2, 8]^t$.

FIGURE 6.12

(a) A simple piecewise-constant function $x(t)$ on the unit interval. (b) The first two generations of
wavelets. (c) Matching $x(t)$ with the wavelet transform $[3, -2, 4, -3]$.

Let’s look at the process from a high level before writing the math. In the Haar
wavelet transform, the amplitude of the scaling function $v$, denoted $b^v$, will represent
the DC (or average) value of the signal, as shown at the bottom of Figure 6.13.
Moving upward in the figure, to build up the signal from that flat function, we
first add in a version of the biggest wavelet $w^{0,0}$, scaled by $b^{0,0}$, which will split the
signal into two constant pieces. The next wavelets are each a half-interval wide,
so we can think of adding to our growing stack a pair of side-by-side wavelets,
independently scaled by $b^{1,0}$ and $b^{1,1}$. Our goal is to find the $b^{a,b}$. We will compute
these wavelet coefficients one at a time, transforming the signal as we go. We built
up the signal from bottom to top in the figure, but we compute the coefficients in
the other direction. We will start with the finest detail first, and work our way out
to the DC component. In effect we will be smoothing out the signal at each step by
running a very simple low-pass filter over it.

The Haar transform repeats a very simple step many times. The step takes two 
adjacent values in the signal and replaces them with two new numbers: their average 
value, and the amplitude of the wavelet that will add to the average to recreate the 
original values. In effect we low-pass filter a section of the signal, and record how 
to displace the local average (or DC) value to recover it. If the signal has more than 
two entries, we apply this process to all adjacent pairs in parallel. This requires that 
the input signal have a length that is exactly a power of two. In other words, we are 
subsampling the signal because we are stepping along the signal in units of two and 
ignoring every other one. 

At each step, we replace a pair of values $p$ and $q$ by their mean $m$ and wavelet
amplitude $w$. Given a pair of adjacent signal values $p$ and $q$, it’s easy to find $m$ and
$w$:

$$m = \frac{p + q}{2}, \qquad w = \frac{p - q}{2} \tag{6.30}$$

To recover $p$ and $q$,

$$p = m + w, \qquad q = m - w \tag{6.31}$$


The values of w are the wavelet coefficients for that section of the signal. The array 
of means m are the pairwise-average values of the signal; this is then a smoothed 
version of the original. We can then repeat exactly the same process on this new, 
smoothed signal. 
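
One such step, and its inverse, might be sketched in Python as follows. The function names `haar_step` and `haar_unstep` are hypothetical, chosen for this illustration; the arithmetic is just Equations 6.30 and 6.31 applied to every adjacent pair.

```python
def haar_step(signal):
    """Replace each adjacent pair (p, q) by its mean and wavelet amplitude."""
    means, details = [], []
    for i in range(0, len(signal), 2):
        p, q = signal[i], signal[i + 1]
        means.append((p + q) / 2)    # m in Equation 6.30
        details.append((p - q) / 2)  # w in Equation 6.30
    return means, details

def haar_unstep(means, details):
    """Recover the pairs: p = m + w, q = m - w (Equation 6.31)."""
    signal = []
    for m, w in zip(means, details):
        signal += [m + w, m - w]
    return signal

means, details = haar_step([5, -3, 2, 8])
print(means, details)
print(haar_unstep(means, details))
```

The list of means is the smoothed, half-length signal that the next step would operate on.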

We can conveniently compute m and w for all adjacent pairs simultaneously, at 
every level of the transform, with a single matrix multiplication. 

We repeat the process, generating wavelet coefficients and a smoother signal at 
each step, until we end up with a signal of length 1; that’s the constant-amplitude 
scaling function v . 

To make this verbal description precise, we’ll now give a symbolic representation. 
We will use an operator notation to keep the expressions simple. We will discuss 
operators in more detail in Chapter 16; for now, we can simply think of them as 
another way to write a function applied to a signal. 

FIGURE 6.13

The input signal $x = [5, -3, 2, 8]^t$. Below that, the series of wavelets that stack together to form
the signal.

We will encounter four operators in the following discussion, one for each step
of the process. The operator $\mathcal{L}$ performs the smoothing $m = (p + q)/2$, resulting
in a half-sized, smoothed signal. The operator $\mathcal{H}$ computes the wavelet amplitude
$w = (p - q)/2$, giving us the wavelet coefficients at that level. To rebuild the signal
from these values, the operator $\mathcal{L}^*$ stretches out a signal of length $n$ to $2n$ by doubling
each entry; this turns each average-value $m$ into two adjacent $m$’s. The operator $\mathcal{H}^*$
takes each wavelet amplitude $w$ and doubles it just like $\mathcal{L}^*$, but it negates the second
value, turning it into a $[w, -w]$ pair that is ready to be added to the $[m, m]$ pair
created by $\mathcal{L}^*$. The $\mathcal{L}$ operator acts as a low-pass filter; the $\mathcal{L}^*$ operator is a simple
zooming operator. The $\mathcal{H}$ operator is a high-pass filter; $\mathcal{H}^*$ also zooms, but it negates
the second value. In symbols,

$$\begin{aligned}
\mathcal{L}(a, b, c, d) &= \left(\frac{a + b}{2}, \frac{c + d}{2}\right) \\
\mathcal{L}^*(a, b) &= (a, a, b, b) \\
\mathcal{H}(a, b, c, d) &= \left(\frac{a - b}{2}, \frac{c - d}{2}\right) \\
\mathcal{H}^*(a, b) &= (a, -a, b, -b)
\end{aligned} \tag{6.32}$$

Armed with this interpretation, we can describe the operators in somewhat more
general terms. We will see that each one can be represented in practice by a straightforward
matrix construction.


6.5.1 Building the Operators

We begin by defining the low-pass filter $\mathcal{L}$ (also called the fine-to-coarse filter, the
decomposition filter, and the restriction operator). One way we will stress the
intuitive interpretation of these operators is to use a nonsymmetric normalization
term. Recall from Chapter 5 that the Fourier transform can be normalized a number
of different ways, and we chose to use a symmetric scale factor $k$ in front of both
the analysis and synthesis equations, rather than $k^2$ (or $1/k^2$) in front of just one
of them. Similarly, wavelets require a normalization term, which for our examples
would be $1/\sqrt{2}$ on both equations, or $1/2$ on just one of them. We’ll use the latter
here because it emphasizes the averaging nature of the operator $\mathcal{L}$, even though it’s
asymmetrical.

Suppose we have an input signal $x$ containing $2^n$ entries; we will call this signal
$x^{(n)}$ (the parentheses are intended to remind us that this index is not an exponent).
The filter $\mathcal{L}$ may be written as an operator on a signal $x^{(n)}$ with $2^n$ entries, producing
a new signal $x^{(n-1)}$ with only $2^{n-1}$ entries. Each entry $i$ of this new signal is given by
evaluating $\mathcal{L}x^{(n)}$ for a particular value of $i$. $\mathcal{L}$ is made up of the coefficients from the
wavelet’s dilation equation:

$$\mathcal{L}_{ik} = c_{2i-k} \tag{6.33}$$

The whole matrix is then scaled by $1/2$. Using this formula, we can find the result
of applying $\mathcal{L}$ to $x^{(n)}$ to find entry $i$ of $x^{(n-1)}$:

$$(\mathcal{L}x)(i) = \frac{1}{2} \sum_{k \in Z} c_{2i-k}\, x(k) \tag{6.34}$$


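Using the example $x^{(2)} = [5, -3, 2, 8]^t$ (so $n = 2$), and for a wavelet where all the
coefficients $c_i = 0$ for $i \ne 0, 1$, we apply Equation 6.34 and find

$$\begin{aligned}
i = 1 &: \tfrac{1}{2}[\cdots + c_2x(0) + \mathbf{c_1x(1)} + \mathbf{c_0x(2)} + c_{-1}x(3) + c_{-2}x(4) + c_{-3}x(5) + \cdots] \\
i = 2 &: \tfrac{1}{2}[\cdots + c_4x(0) + c_3x(1) + c_2x(2) + \mathbf{c_1x(3)} + \mathbf{c_0x(4)} + c_{-1}x(5) + \cdots]
\end{aligned} \tag{6.35}$$

where the four nonzero entries have been highlighted in bold. Eliminating the zero
entries, we end up with a fairly simple matrix equation:

$$\mathcal{L}x = \frac{1}{2} \begin{bmatrix} c_1 & c_0 & 0 & 0 \\ 0 & 0 & c_1 & c_0 \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \\ x(3) \\ x(4) \end{bmatrix} \tag{6.36}$$

For the Haar wavelet, $c_0 = c_1 = 1$, so the new, smoothed signal $x^{(1)}$ is simply

$$x^{(1)} = \mathcal{L}x^{(2)} = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}
\begin{bmatrix} 5 \\ -3 \\ 2 \\ 8 \end{bmatrix} = \begin{bmatrix} 1 \\ 5 \end{bmatrix} \tag{6.37}$$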


Notice that this matrix averages together adjacent pairs of values, computing $a/2 +
b/2$, which is the value of $m$ in Equation 6.30. This new signal now contains only
half as many entries as the original; we have low-pass filtered the input signal (with 
a box filter) and resampled the result at half the prior sampling rate. 

We can now repeat the process, smoothing the new signal again: 


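$$x^{(0)} = \mathcal{L}x^{(1)} = \begin{bmatrix} \tfrac{1}{2} & \tfrac{1}{2} \end{bmatrix}
\begin{bmatrix} 1 \\ 5 \end{bmatrix} = [\,3\,] \tag{6.38}$$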


This sequence of signals is shown in Figure 6.14(a). 
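
As an illustration of how the smoothing matrix follows from the dilation coefficients, here is a small Python sketch that fills in the entries $c_{2i-k}/2$ with 1-based indices $i$ and $k$. The helper names (`c`, `smoothing_matrix`, `apply`) are assumptions introduced for the example, and the default coefficients are the Haar values $c_0 = c_1 = 1$.

```python
def c(k, coeffs=(1, 1)):
    """Dilation coefficient c_k; zero outside the listed range (Haar by default)."""
    return coeffs[k] if 0 <= k < len(coeffs) else 0

def smoothing_matrix(n):
    """(n/2) x n matrix with entries c_{2i-k} / 2, for 1-based i and k."""
    return [[c(2 * i - k) / 2 for k in range(1, n + 1)]
            for i in range(1, n // 2 + 1)]

def apply(matrix, x):
    """Plain matrix-vector product."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]

x2 = [5, -3, 2, 8]
x1 = apply(smoothing_matrix(4), x2)  # smooth once
x0 = apply(smoothing_matrix(2), x1)  # smooth again
print(smoothing_matrix(4))
print(x1, x0)
```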

Now we want to find a description of what changes when each signal $x^{(n)}$ is
smoothed to form $x^{(n-1)}$. In other words, if the smoother signal were “expanded”
to twice its length (similar to pixel replication for fast zooming of an image), then
we could subtract the lower-resolution, smoother signal from the higher-resolution
one, and get a “difference signal.”

FIGURE 6.14

The original signal $x^{(2)} = [5, -3, 2, 8]^t$. (a) The smoothed signals $x^{(1)}$ and $x^{(0)}$, each half as long
as the previous one. (b) Each enlarged using the $\mathcal{L}^*$ operator. (c) The difference signal lost at
each level of smoothing.

In Section 6.8 we will see a formalism for describing the spaces spanned by
the signal at different resolutions. For now, we notice that the signals $x^{(n)}$ and
$x^{(n-1)}$ may be considered to inhabit two different spaces $A^n$ and $A^{n-1}$. To find the
difference between two signals, we would like to find a set of functions that spans
the difference space $A^{n-1} - A^n$. This isn’t always easy [90].

For our example, the coarse-to-fine operator $\mathcal{L}^*$ lets us subtract the low-resolution
signal from the higher-resolution version directly. It is defined similarly to $\mathcal{L}$, but the
coefficients in the equation have switched position:

$$(\mathcal{L}^*x)_j = \sum_{i \in Z} c_{2i-j}\, x(i), \qquad j = 1, \ldots, n \tag{6.39}$$

All that this operator really does is make sure that we pick up each index of the
lower-resolution signal twice before we move on to the next. Let’s use this equation
to form the four-element signal that blows up the two-element signal $x^{(1)}$:

$$\begin{aligned}
j = 1 &: \cdots + c_{-1}x(0) + c_1x(1) + c_3x(2) + c_5x(3) + \cdots \\
j = 2 &: \cdots + c_{-2}x(0) + c_0x(1) + c_2x(2) + c_4x(3) + \cdots \\
j = 3 &: \cdots + c_{-3}x(0) + c_{-1}x(1) + c_1x(2) + c_3x(3) + \cdots \\
j = 4 &: \cdots + c_{-4}x(0) + c_{-2}x(1) + c_0x(2) + c_2x(3) + \cdots
\end{aligned}$$

which again turns into the simple equation

$$\mathcal{L}^*x = \begin{bmatrix} c_1 & 0 \\ c_0 & 0 \\ 0 & c_1 \\ 0 & c_0 \end{bmatrix}
\begin{bmatrix} x(1) \\ x(2) \end{bmatrix} \tag{6.40}$$




Each of these “enlargements” is illustrated in our test case in Figure 6.14(b). 

Now we can find the difference we wanted. The difference signal $d^{(n)}$ corresponding
to the detail lost when signal $x^{(n)}$ is smoothed to signal $x^{(n-1)}$ is given
by

$$d^{(n)} = x^{(n)} - \mathcal{L}^*x^{(n-1)} = (\mathcal{I} - \mathcal{L}^*\mathcal{L})\,x^{(n)}, \qquad n = N, \ldots, 1 \tag{6.42}$$

where $\mathcal{I}$ is the identity operator: $\mathcal{I}x = x$. These difference signals are shown in
Figure 6.14(c).

It’s now easy to recover the original function $x^{(N)}$ from the smoothest version
$x^{(0)}$ and the difference signals; the formula is simply

$$x^{(n)} = d^{(n)} + \mathcal{L}^*x^{(n-1)}, \qquad n = 1, \ldots, N \tag{6.43}$$
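
The smoothing, enlarging, and differencing steps can be sketched in Python for the running example. The names `smooth`, `enlarge`, and `difference` are illustrative assumptions standing in for the Haar forms of the smoothing operator, the coarse-to-fine operator, and the subtraction that defines the difference signal.

```python
def smooth(x):
    """Fine-to-coarse: average adjacent pairs."""
    return [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]

def enlarge(x):
    """Coarse-to-fine: repeat each entry twice."""
    return [value for value in x for _ in range(2)]

def difference(fine, coarse):
    """Detail lost when `fine` is smoothed to `coarse`."""
    return [f - c for f, c in zip(fine, enlarge(coarse))]

x2 = [5, -3, 2, 8]
x1 = smooth(x2)
x0 = smooth(x1)
d2 = difference(x2, x1)
d1 = difference(x1, x0)

# Rebuild from the smoothest version and the difference signals.
x1_rebuilt = [d + e for d, e in zip(d1, enlarge(x0))]
x2_rebuilt = [d + e for d, e in zip(d2, enlarge(x1_rebuilt))]
print(d2, d1, x2_rebuilt)
```

Each difference signal holds its information twice, as a value and its negation, which is why the wavelet coefficients need only half as many entries.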


This decomposition provides a nice way of looking at the signal over multiple
resolutions, but it doesn’t explicitly provide us with the wavelet transform; that is,
we don’t have the coefficients $b_i$ that describe the amplitudes of the basis wavelets.
Clearly these coefficients are stored in the $d^{(k)}$, but they are also easily obtained from
the $x^{(k)}$ themselves.

To find the wavelet coefficients, we need only build a new high-pass filter $\mathcal{H}$, such
that $\mathcal{H}\mathcal{L}^* = 0$. Like $\mathcal{L}$, this filter also has a $1/2$ in front. The coefficients of this filter
are given by

$$\mathcal{H}_{ik} = (-1)^{k+1} c_{k+1-2i} \tag{6.44}$$

For example, for the two-coefficient case we find

$$\mathcal{H} = \frac{1}{2} \begin{bmatrix} c_0 & -c_1 & 0 & 0 \\ 0 & 0 & c_0 & -c_1 \end{bmatrix} \tag{6.45}$$

This filter gives the coefficients on $d^{(n)}$ when passing from step $n + 1$ to $n$.


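For the Haar wavelets, this matrix becomes

$$\mathcal{H} = \begin{bmatrix} \tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0 \\ 0 & 0 & \tfrac{1}{2} & -\tfrac{1}{2} \end{bmatrix} \tag{6.46}$$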


Notice that these are just the values needed to find the coefficient w in Equation 6.30. 
We can now state the complete algorithm for finding the wavelet coefficients. 

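Decomposition: Given a signal $x^{(N)}$ containing $2^N$ components,
for $n = N, \ldots, 1$ compute

$$\begin{aligned}
x^{(n-1)} &= \mathcal{L}x^{(n)} \\
b^{(n-1)} &= \mathcal{H}x^{(n)}
\end{aligned} \tag{6.47}$$

Reconstruction: Given $x^{(0)}$ and $b^{(0)}, \ldots, b^{(N-1)}$, for $n = 1, \ldots, N$
compute

$$x^{(n)} = \mathcal{L}^*x^{(n-1)} + \mathcal{H}^*b^{(n-1)} \tag{6.48}$$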
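
The two loops above might be sketched in Python as follows, for the Haar case only. The function names `decompose` and `reconstruct`, and the choice to store the result as one flat list ordered from the DC value to the finest coefficients, are assumptions made for this illustration rather than anything fixed by the text.

```python
def decompose(signal):
    """Haar decomposition of a signal whose length is a power of two.

    Returns [b_v, b_00, b_10, b_11, ...]: the DC value followed by the
    wavelet coefficients from the coarsest level to the finest.
    """
    x = list(signal)
    coefficients = []
    while len(x) > 1:
        means = [(x[i] + x[i + 1]) / 2 for i in range(0, len(x), 2)]
        details = [(x[i] - x[i + 1]) / 2 for i in range(0, len(x), 2)]
        coefficients = details + coefficients  # finer levels end up last
        x = means
    return x + coefficients

def reconstruct(transform):
    """Invert decompose(): enlarge each level and add back its details."""
    x = list(transform[:1])
    position = 1
    while position < len(transform):
        details = transform[position:position + len(x)]
        position += len(x)
        x = [value for m, w in zip(x, details) for value in (m + w, m - w)]
    return x

transform = decompose([5, -3, 2, 8])
print(transform)
print(reconstruct(transform))
```

The same pair of functions applied to an eight-element signal produces eight numbers: one DC value and seven wavelet coefficients.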

You may notice that the matrices $\mathcal{L}$ and $\mathcal{H}$ have a premultiplied value of $1/2$, while
the “reverse” matrices $\mathcal{L}^*$ and $\mathcal{H}^*$ don’t. As I mentioned before, this is a normalizing
factor, just like the $1/\sqrt{2\pi}$ factor in the Fourier transform. We can distribute the
$1/2$ by using a factor of $1/\sqrt{2}$ on all four matrices. I chose to leave the equations
asymmetrical because it makes it easier to see how the Haar coefficients appear in
the matrices.

This algorithm can be thought of as a tree or pyramid, as diagrammed in Figure 6.15.

A more conventional signal-processing view of the algorithm is given in Figure 6.16.
Here the notation $\uparrow 2$ means that a signal is upsampled by a factor of 2 by
inserting zeros; for example, $[a, b]$ turns into $[a, 0, b, 0]$. Similarly, $\downarrow 2$ means that a
signal is downsampled by a factor of 2 by ignoring every other sample; for example,
$[a, a, b, b]$ turns into $[a, b]$. This form of the wavelet transform makes use of three
operators in addition to the upsampling and downsampling operators. For the Haar
wavelets, the operator $\mathcal{A}$ averages its input:

$$\mathcal{A}: \quad y[k] = \begin{cases} (x[k] + x[k+1])/2 & k \text{ is odd} \\ (x[k] + x[k-1])/2 & k \text{ is even} \end{cases} \tag{6.49}$$

the operator $\mathcal{C}$ copies:

$$\mathcal{C}: \quad y[k] = \begin{cases} x[k] & k \text{ is odd} \\ x[k-1] & k \text{ is even} \end{cases} \tag{6.50}$$

and the operator $\mathcal{R}$ replicates with alternating sign:

$$\mathcal{R}: \quad y[k] = \begin{cases} x[k] & k \text{ is odd} \\ -x[k-1] & k \text{ is even} \end{cases} \tag{6.51}$$

FIGURE 6.15

The decomposition of a signal into wavelet coefficients and smoothed versions (top half). The
reconstruction of a signal from smooth versions and wavelet coefficients (bottom half).







FIGURE 6.16

The pyramid algorithm in signal-processing terms.


Notice that in Figure 6.16 the input signal $x_0[n]$ with $m$ samples is split into two
new signals, with $m$ and $m/2$ samples, respectively. The new, half-length signal $x_1[n]$
is then operated on by the next step. If it requires $U$ units of work to calculate one
stage of the transform on $x_0$, then it takes $U/2$ units to transform $x_1$, and $U/4$ units
to transform $x_2$. In the limit,

$$U + \frac{U}{2} + \frac{U}{4} + \cdots < 2U \tag{6.52}$$

so the total amount of work required is (asymptotically) linear in the size of the 
input array. No invertible transform can be any less expensive than this (though in 
practice the cost of each step can vary among transforms). 
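
The linear bound is easy to confirm by counting. In the Python sketch below, the unit of work is taken to be one pair operation (one mean and one difference), which is an assumption made for this illustration; the function name `pair_operations` is likewise hypothetical. A signal of $n = 2^N$ samples needs $n/2$ pair operations in the first stage and $n - 1$ in total, which stays below twice the first-stage cost.

```python
def pair_operations(n):
    """Total pair operations for a full Haar decomposition of n = 2**N samples."""
    total = 0
    while n > 1:
        n //= 2      # each stage halves the signal
        total += n   # one pair operation per output sample of the stage
    return total

for exponent in range(1, 11):
    n = 2 ** exponent
    first_stage = n // 2
    print(n, pair_operations(n), 2 * first_stage)
```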
























FIGURE 6.17

Wavelet compression, demonstrated by successively adding in terms of increasing magnitude.
(a) The original signal. (b)-(i) Increasingly close fits to (a) using more wavelets.


6.6 Compression 

The wavelet decomposition in Figure 6.16 is a good example of how wavelets are 
useful for compression. The eight-element input signal in Figure 6.17(a) is turned 
into eight wavelet coefficients: 









$$\begin{aligned}
[1, 7, 3, 1, 6, 0, 9, 5] \rightarrow{} & [b^v, b^{0,0}, b^{1,0}, b^{1,1}, b^{2,0}, b^{2,1}, b^{2,2}, b^{2,3}] \\
={} & [4, -1, 1, -2, -3, 1, 3, 2]
\end{aligned} \tag{6.53}$$

The technique of compression is used to create an approximation of the signal with 
fewer values (say m) than the original (say for transmitting a rough idea of the signal 
over a channel where sending the full signal is too expensive). One way to compress 
is to take the Fourier transform of the signal and then retain only the first m Fourier 
coefficients. That makes sense because the first few Fourier terms represent the 
low-frequency information in the signal, and higher terms add in higher frequencies, 
although as I mentioned earlier, loss of the high-frequency information will affect 
the entire signal. 

In the wavelet transform, we don’t want to keep simply the lowest-order terms,
but rather those with the largest magnitude, wherever they may be. For example,
if we could only save one of the eight terms in Equation 6.53, which would it be?
If we estimate the error $E$ in any $m$-term approximation $s_m$ of $s$ by the sum of the
element-by-element errors, then all our error terms will be integers (this will make
it easy to compare various error figures; other error metrics are addressed in the
exercises). The error $E$ is then given by

$$E = \sum_k \left| s[k] - s_m[k] \right| \tag{6.54}$$

k 

If we compute this error for each of the eight coefficients in Equation 6.53, then we
find these measures (using the same order):

$$E_1 = [22, 34, 32, 32, 32, 32, 32, 32] \tag{6.55}$$

So the best one-coefficient approximation $s_1$ to $s$ has an error of only 22. This is
given by $b^v v$, as shown in Figure 6.17(b). Notice that this is also the coefficient with
the greatest magnitude.

Our signal $s_1 = b^v v$ is now an eight-element vector with a constant value of four.
What’s the next best term to add in? We can compute the error between $s$ and $s_1$
plus each of the remaining wavelets to find:

$$E_2 = [\bullet, 18, 20, 20, 16, 22, 18, 22] \tag{6.56}$$

So $s_2 = s_1 + b^{2,0} w^{2,0}$ represents the best two-coefficient approximation to $s$, as
shown in Figure 6.17(c). Observe that the largest coefficient magnitude available at
this step was 3, and only $b^{2,0}$ and $b^{2,2}$ are of this size. We might then expect that $b^{2,2}$
is going to be our next best choice, and indeed we can write out the new errors to
confirm this:

$$E_3 = [\bullet, 14, 16, 14, \bullet, 16, 12, 16] \tag{6.57}$$

The rest of the development is shown in Figure 6.17(d)—(i). 

Note that it is important to keep track of which wavelet is saved at each step. We 
might keep a list of pairs, the first element representing the coefficient’s value, 
and the second representing the wavelet index for that coefficient. Our example would 
be 

$$\{(4,1), (-3,5), (3,7), (-2,4), (2,8), (-1,2), (1,3), (1,6)\} \tag{6.58}$$

where wavelet index 1 stands for $v$, index 2 is $w^{0,0}$, and so on, so index 8 stands for 
$w^{2,3}$. Of course, if we save all these pairs we’ve expanded our storage rather than 
compressed it, so typically only the first few pairs are retained; wavelet coefficients 
for real-world signals seem to often decrease in magnitude very quickly, making this 
form of compression attractive. 
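
The selection procedure above can be sketched in a few lines of Python. This is only an illustration, assuming the unnormalized Haar vectors of the earlier sections and the absolute-error measure of Equation 6.54; the function names are invented for this sketch, and ties between equally good terms are broken by lowest index, which is an arbitrary choice.

```python
def haar_basis(n):
    """Unnormalized Haar vectors for length n (a power of two):
    the scaling vector v first, then the wavelets from coarse to fine."""
    basis = [[1] * n]
    width = n
    while width >= 2:
        half = width // 2
        for start in range(0, n, width):
            basis.append([0] * start + [1] * half + [-1] * half
                         + [0] * (n - start - width))
        width = half
    return basis


def coefficients(signal, basis):
    """Inner product with each basis vector, divided by its number of nonzeros."""
    return [sum(s * w for s, w in zip(signal, vec)) / sum(abs(w) for w in vec)
            for vec in basis]


def abs_error(signal, approx):
    """Sum of element-by-element absolute errors (Equation 6.54)."""
    return sum(abs(s - a) for s, a in zip(signal, approx))


def add_term(approx, coeff, vec):
    """Add one scaled basis vector to the current approximation."""
    return [a + coeff * w for a, w in zip(approx, vec)]


def trial_errors(signal, approx, basis, coeffs, used):
    """Error after adding each unused term to approx; None marks a used term."""
    return [None if i in used
            else abs_error(signal, add_term(approx, coeffs[i], basis[i]))
            for i in range(len(basis))]


def greedy_order(signal, steps):
    """Pick `steps` terms, each time the one giving the smallest error."""
    basis = haar_basis(len(signal))
    coeffs = coefficients(signal, basis)
    approx, used, pairs = [0] * len(signal), set(), []
    for _ in range(steps):
        errors = trial_errors(signal, approx, basis, coeffs, used)
        best = min((i for i in range(len(basis)) if i not in used),
                   key=lambda i: errors[i])
        approx = add_term(approx, coeffs[best], basis[best])
        used.add(best)
        pairs.append((coeffs[best], best + 1))  # (value, 1-based wavelet index)
    return pairs


signal = [1, 7, 3, 1, 6, 0, 9, 5]
print(coefficients(signal, haar_basis(8)))
print(greedy_order(signal, 4))
```

Because the choice is driven by the error rather than by coefficient magnitude alone, later picks can depend on how ties are resolved, so the tail of the list need not match a pure sort by magnitude.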
6.7 Coefficient Conditions 

We can roll together the matrix operations from the previous sections into a single 
interleaved calculation. When we look to invert this matrix equation, we will find 
that it requires some conditions on the coefficients, which in turn influence the types 
of functions that can be used to satisfy the dilation equation, which then influence 
the types of wavelets we can build. 

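Notice that both the $\mathcal{L}$ and $\mathcal{H}$ operators downsample their inputs by a factor of 
2; that is, their outputs are half as big as their inputs. Therefore the matrix forms 
of these operators have dimensions $n \times 2n$. We can put them together, interleaved, 
to create a single composite matrix $A$. For a wavelet with four nonzero coefficients 
$c_0, c_1, c_2, c_3$, this becomes 

$$A = \begin{bmatrix}
c_0 & c_1 & c_2 & c_3 & & & & \\
c_3 & -c_2 & c_1 & -c_0 & & & & \\
 & & c_0 & c_1 & c_2 & c_3 & & \\
 & & c_3 & -c_2 & c_1 & -c_0 & & \\
 & & & & c_0 & c_1 & c_2 & c_3 \\
 & & & & c_3 & -c_2 & c_1 & -c_0 \\
c_2 & c_3 & & & & & c_0 & c_1 \\
c_1 & -c_0 & & & & & c_3 & -c_2
\end{bmatrix} \tag{6.59}$$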


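To invert the transform, we need to find $A^{-1}$. Recall that the inverse of an 
orthogonal matrix is equal to its transpose [420]. We have seen that the wavelet 
basis is orthogonal, so it is reasonable to consider $A^T$ as a candidate for $A^{-1}$: 

$$A^T = \begin{bmatrix}
c_0 & c_3 & & & & & c_2 & c_1 \\
c_1 & -c_2 & & & & & c_3 & -c_0 \\
c_2 & c_1 & c_0 & c_3 & & & & \\
c_3 & -c_0 & c_1 & -c_2 & & & & \\
 & & c_2 & c_1 & c_0 & c_3 & & \\
 & & c_3 & -c_0 & c_1 & -c_2 & & \\
 & & & & c_2 & c_1 & c_0 & c_3 \\
 & & & & c_3 & -c_0 & c_1 & -c_2
\end{bmatrix} \tag{6.60}$$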


Notice how the wrapped-around tail ends of the coefficients show up in the upper- 
$$\alpha = c_0^2 + c_1^2 + c_2^2 + c_3^2$$

$$\beta = c_0 c_2 + c_1 c_3$$


In words, we have a main diagonal of $\alpha$, with diagonals of $\beta$ one column to the 
left and one to the right (or, equivalently, one row above and below) of the main 
diagonal, with appropriate wraparound effects. 

The product $AA^T$ will be the identity (and thus $A^T$ will be confirmed as $A^{-1}$) if 
$\alpha = 1$ and $\beta = 0$ in Equation 6.61. These are the first two conditions we will demand 
from our dilation equation coefficients: 

1 $c_0^2 + c_1^2 + c_2^2 + c_3^2 = 1$ 

2 $c_0 c_2 + c_1 c_3 = 0$ 

Notice that the Haar matrices used in the previous section don’t satisfy these 
conditions. As noted earlier, they could be adjusted by a scale factor of $1/\sqrt{2}$. Then 
we’d have $c_0 = c_1 = 1/\sqrt{2}$, so $\alpha = c_0^2 + c_1^2 = 1$, as desired. And since $c_2 = c_3 = 0$, 
we have $\beta = 0$ as well. 
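
A quick numerical check of the claim that $AA^T$ is the identity once the coefficients are scaled can be sketched as follows. The code assumes the interleaved layout with wraparound described for the composite matrix (low-pass row $c_0, c_1, c_2, c_3$ followed by high-pass row $c_3, -c_2, c_1, -c_0$, each pair shifted by two columns); the function names are invented for this illustration.

```python
import math


def composite_matrix(c, pairs):
    """Build the interleaved matrix A for four coefficients c = (c0, c1, c2, c3).
    Each pair of rows holds (c0, c1, c2, c3) and (c3, -c2, c1, -c0), shifted two
    columns to the right of the previous pair and wrapped around at the edge."""
    c0, c1, c2, c3 = c
    size = 2 * pairs
    rows = []
    for p in range(pairs):
        for taps in ((c0, c1, c2, c3), (c3, -c2, c1, -c0)):
            row = [0.0] * size
            for k, tap in enumerate(taps):
                row[(2 * p + k) % size] += tap
            rows.append(row)
    return rows


def times_transpose(a):
    """Return the product A A^T as a list of rows."""
    return [[sum(x * y for x, y in zip(r, s)) for s in a] for r in a]


def is_identity(m, tol=1e-12):
    """True when m equals the identity matrix to within tol."""
    return all(abs(m[i][j] - (1.0 if i == j else 0.0)) < tol
               for i in range(len(m)) for j in range(len(m)))


r = 1 / math.sqrt(2)
print(is_identity(times_transpose(composite_matrix((1, 1, 0, 0), 4))))  # unscaled Haar
print(is_identity(times_transpose(composite_matrix((r, r, 0, 0), 4))))  # scaled Haar
```

Any other four-coefficient set can be passed in the same way to see whether it meets conditions 1 and 2.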

The Haar wavelets have only zero-order matching properties; that is, they can 
only match functions that are piecewise-constant. When building up to the integral 
in calculus, we start with the zero-order rectangles to approximate an area under 
a curve, but we then move on to trapezoids, which have first-order matching; that 
is, they can exactly match any piecewise linear function. Higher-order curves (e.g., 
quadratics and cubics) give us even better continuity. 

When discussing matching, it is important to note that it is combinations of the 
scaling functions that actually match (or duplicate) the function. The wavelets that 
are derived from the scaling function may individually have very strange shapes (such 
as that in Figure 6.18), but they combine in just the right way to make something 
much less bizarre and more regular. 
FIGURE 6.18 

(a) The scaling function $v$ generated by $D_4$. (b) The first wavelet $w^{0,0}$ generated by $D_4$. 


We say that the order of the wavelet is the type of signal that can be matched 
by the scaling function. The regularity of each function is the number of times the 
function is continuously differentiable. 

We can move from the zero-order continuity offered by the Haar wavelets to 
first-order continuity by imposing two additional conditions. For a four-coefficient 
wavelet: 


3 $c_3 - c_2 + c_1 - c_0 = 0$ 

4 $0c_3 - 1c_2 + 2c_1 - 3c_0 = 0$ 

These two conditions respectively enforce the first two vanishing moments. When 
these two conditions are joined to the two others above, we get a set of four coefficients 
that generate wavelets that can match any linear function [302]. These are 
called the Daubechies first-order wavelets, sometimes written $D_4$, and are given by 

$$\begin{aligned}
c_0 &= \tfrac{1}{4}(1 + \sqrt{3}) \approx 0.68301 \\
c_1 &= \tfrac{1}{4}(3 + \sqrt{3}) \approx 1.18301 \\
c_2 &= \tfrac{1}{4}(3 - \sqrt{3}) \approx 0.31699 \\
c_3 &= \tfrac{1}{4}(1 - \sqrt{3}) \approx -0.18301
\end{aligned} \tag{6.62}$$
The scaling function $v$ and first-order wavelet $w^{0,0}$ based on these coefficients are 
shown in Figure 6.18. The coefficients for higher-order wavelets may be found 
in Daubechies’ book [115]. Note that the coefficient values in the literature often 
include the $1/\sqrt{2}$ normalization factor. 
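
The role of the normalization factor is easy to see numerically. The sketch below evaluates the left-hand sides of conditions 1 through 4 for the $D_4$ values in the closed form $(1 \pm \sqrt{3})/4$, $(3 \pm \sqrt{3})/4$, first as written and then after scaling by $1/\sqrt{2}$. The names `D4`, `scaled`, and `condition_values` are made up for this illustration.

```python
import math

SQRT3 = math.sqrt(3.0)
D4 = ((1 + SQRT3) / 4, (3 + SQRT3) / 4, (3 - SQRT3) / 4, (1 - SQRT3) / 4)


def condition_values(c):
    """Left-hand sides of conditions 1 through 4 for c = (c0, c1, c2, c3)."""
    c0, c1, c2, c3 = c
    return (c0**2 + c1**2 + c2**2 + c3**2,      # condition 1 (alpha)
            c0 * c2 + c1 * c3,                  # condition 2 (beta)
            c3 - c2 + c1 - c0,                  # condition 3
            0 * c3 - 1 * c2 + 2 * c1 - 3 * c0)  # condition 4


scaled = tuple(ck / math.sqrt(2) for ck in D4)
print([round(ck, 5) for ck in D4])
print([round(x, 12) for x in condition_values(D4)])
print([round(x, 12) for x in condition_values(scaled)])
```

With the values as listed, the sum of squares comes out as 2 while the other three expressions vanish; dividing every coefficient by $\sqrt{2}$ brings the sum of squares to 1 and leaves the three zero conditions untouched, since they are homogeneous in the coefficients.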

In general, the order of the wavelet describes its ability to match a polynomial. 
As we saw earlier, we can define the order of a wavelet in terms of its projection onto 
the monomials (that is, its vanishing moments); a wavelet $w^{a,b}(x)$ of order $p$ satisfies 

$$\int x^m\, w^{a,b}(x)\, dx = 0, \qquad 0 \le m \le p - 1 \tag{6.63}$$

We can summarize the coefficient conditions for a wavelet of order p generated 
by this type of construction, as discussed in Xu and Shann [493]: 


$$\begin{aligned}
c_k &= 0 && \text{for } k \notin \{0, 1, \ldots, 2p-1\} \\
\sum_k c_k &= 2 \\
\sum_k (-1)^k k^m c_k &= 0 && \text{for } 0 \le m \le p-1 \\
\sum_k c_k c_{k-2m} &= 2\delta(m) && \text{for } 1-p \le m \le p-1
\end{aligned} \tag{6.64}$$

The coefficients $c_0 = c_1 = 1$ for the Haar wavelets correspond to $p = 1$, and the 
Daubechies coefficients in Equation 6.62 correspond to $p = 2$. 
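
The summary conditions lend themselves to a mechanical test. The following sketch checks a coefficient list against an order $p$, reading the four conditions as: no coefficients beyond index $2p-1$, coefficients summing to 2, alternating moments vanishing for $0 \le m \le p-1$, and shifted products equal to $2\delta(m)$ for $1-p \le m \le p-1$. The function name `satisfies_order` and the tolerance are assumptions of this illustration.

```python
import math


def satisfies_order(c, p, tol=1e-12):
    """Check the conditions of Equation 6.64 for coefficients c[0..] and order p."""
    def coef(k):
        # Coefficients outside the stored list are zero.
        return c[k] if 0 <= k < len(c) else 0.0

    support = all(abs(ck) < tol for ck in c[2 * p:])   # c_k = 0 beyond 2p - 1
    total = abs(sum(c) - 2) < tol                       # sum of c_k is 2
    moments = all(
        abs(sum((-1) ** k * k ** m * ck for k, ck in enumerate(c))) < tol
        for m in range(p))                              # 0 <= m <= p - 1
    products = all(
        abs(sum(coef(k) * coef(k - 2 * m) for k in range(len(c)))
            - (2 if m == 0 else 0)) < tol
        for m in range(1 - p, p))                       # 1 - p <= m <= p - 1
    return support and total and moments and products


haar = [1.0, 1.0]
s3 = math.sqrt(3.0)
d4 = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]
print(satisfies_order(haar, 1), satisfies_order(d4, 2))
print(satisfies_order(haar, 2), satisfies_order(d4, 3))
```

The second line asks each set for one order more than it was built for, which is a convenient way to confirm that the test is not trivially passing.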

Although the first wavelet from $D_4$ appears in Figure 6.18, we haven’t seen yet 
how to actually compute the values of the wavelet function. It is remarkable that we 
don’t actually ever need to find the wavelets at all! As long as we have the coefficients, 
then we can carry out the wavelet transforms described above, and decompose and 
reconstruct our signals. 

The reason we don’t need the wavelets explicitly is that we really only need 
to find the projection (that is, the inner product) of the signal with the various 
wavelet functions. It may be surprising, but we can compute those inner products 
without actually generating the wavelet function itself [429]. The idea is to create an 
integration rule from just the coefficients. The integration rule is based on sampling 
the function at a number of points and weighting those samples (we will have much 
more to say about integration methods in Chapter 16). If the rule is not very good, 
then we will get a result that matches the signal at the sample points but might be 
a very poor approximation between them. More sophisticated rules are better able 
to estimate the signal, and thereby compute the inner product of the signal with a 
weighting function. Sweldens and Piessens have shown how to compute high-quality 
inner products with just the wavelet coefficients [429]. 
It is interesting to think about the wavelet functions themselves; the one in Figure 6.18 has 
an intriguing shape. When evaluating wavelet functions, we need to be careful, since 
we have almost no guarantees on the shape, range, or continuity of the functions 
generated by a particular set of coefficients. 

We can make the job easier by limiting our attention to values of the wavelet 
function at dyadic points; those are given by $n2^{-m}$, for $n, m \in \mathbb{Z}$. We can find 
these points by recursive midpoint subdivision. If we know the value of the scaling 
function v at the integers, then we can use the dilation equation to find the values at 
the half-integers. From those values, we find the values at the quarter-integers, and 
so on, recursing to any dyadic point of interest. So the only remaining trick is to find 
the starting points. 

To find the starting values, we can look for fixed points of the recursion [90]; 
that is, points that don’t change in value as the dilation equation is applied. We find 
these by writing the dilation equation as a recursive matrix equation and finding the 
eigenvectors of the matrix. These eigenvectors are the values of the points that are 
passed through by the matrix unchanged except for scale. 

Let’s find the endpoint values for $D_4$. We begin by writing the dilation equation 
from Equation 6.10 for $v(1)$ and $v(2)$: 

$$\begin{aligned}
v(1) &= \mathbf{c_0 v(2)} + \mathbf{c_1 v(1)} + c_2 v(0) + c_3 v(-1) \\
v(2) &= c_0 v(4) + c_1 v(3) + \mathbf{c_2 v(2)} + \mathbf{c_3 v(1)}
\end{aligned} \tag{6.65}$$

We only wrote the values for the nonzero coefficients $c_0 \ldots c_3$. We’ve marked in bold 
the nonzero products; at the integers, the scaling function $v$ is zero outside the interval $[1,2]$. The 
result is that we can write a small recurrence equation: 


$$\begin{bmatrix} v(1) \\ v(2) \end{bmatrix} = \begin{bmatrix} c_1 & c_0 \\ c_3 & c_2 \end{bmatrix} \begin{bmatrix} v(1) \\ v(2) \end{bmatrix} \tag{6.66}$$


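which we can write as $\mathbf{v} = M\mathbf{v}$. We’ll now find the eigenvalues $\lambda_1, \lambda_2$ of $M$. Recall 
that these are the solutions to $\det(M - \lambda I) = 0$ [420]. We write 

$$\det \begin{bmatrix} c_1 - \lambda & c_0 \\ c_3 & c_2 - \lambda \end{bmatrix} = 0 \tag{6.67}$$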


For the coefficients of $D_4$ in Equation 6.62, there are two eigenvalues associated 
with this matrix: $\lambda_1 = 1$ and $\lambda_2 = 1/2$. The corresponding eigenvectors are 

$$\mathbf{v}_1 = [1 + \sqrt{3},\; 1 - \sqrt{3}] = [4c_0, 4c_3], \qquad \mathbf{v}_2 = [-1, 1] \tag{6.68}$$

These vectors give us the two values that allow us to find all the others. In other 
words, we have found that $[v(1), v(2)] = [4c_0, 4c_3]$. From these two values we can 
use Equation 6.11 to find the value of $v$ at all other dyadic points. 














For example, to find $v(3/2)$, we write 

$$\begin{aligned}
v(3/2) &= c_0 v(3) + c_1 v(2) + c_2 v(1) + c_3 v(0) \\
&= c_1 v(2) + c_2 v(1) \\
&= c_1 4c_3 + c_2 4c_0 \\
&= 0
\end{aligned} \tag{6.69}$$
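
The whole procedure (eigenvalues of the small matrix, starting values at the integers, then repeated halving of the spacing) can be sketched as below. The code assumes the dilation equation in the form $v(t) = \sum_k c_k v(2t - k)$ and keeps the text's starting values $[4c_0, 4c_3]$; since an eigenvector is only defined up to scale, any other multiple would serve equally well. Function names are invented for this illustration, and exact fractions are used as dictionary keys so that dyadic points compare exactly.

```python
from fractions import Fraction
import math

S3 = math.sqrt(3.0)
C = [(1 + S3) / 4, (3 + S3) / 4, (3 - S3) / 4, (1 - S3) / 4]


def eigenvalues_2x2(m):
    """Roots of det(M - lambda I) = 0 for a 2 x 2 matrix with real eigenvalues."""
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = math.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2


def dyadic_values(c, levels):
    """Values of v at the points n / 2**levels in [0, 3], keyed by Fraction.
    Starts from [v(1), v(2)] = [4 c0, 4 c3] and halves the spacing `levels`
    times with v(t) = sum_k c_k v(2t - k). Points not yet assigned are read
    as zero, which is correct here because they lie at or beyond 0 and 3."""
    v = {Fraction(1): 4 * c[0], Fraction(2): 4 * c[3]}
    for level in range(1, levels + 1):
        finer = dict(v)
        for n in range(3 * 2 ** level + 1):
            t = Fraction(n, 2 ** level)
            if t not in v:
                finer[t] = sum(ck * v.get(2 * t - k, 0.0) for k, ck in enumerate(c))
        v = finer
    return v


print(eigenvalues_2x2([[C[1], C[0]], [C[3], C[2]]]))
values = dyadic_values(C, 3)
print(values[Fraction(3, 2)])
```

Raising `levels` fills in the curve as finely as needed; plotting the sorted keys against their values gives a picture of the scaling function.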


6.8 Multiresolution Analysis 

Each application of the low-pass filter $\mathcal{L}$ in the previous section halves the number of 
samples representing our signal. We can think of this series of signals as representing 
the original signal at a variety of resolutions. 

The technique of multiresolution analysis provides a formalism for discussing this 
property of wavelets. We can think of the input signal as belonging to a space of 
signals, and then we construct a nested chain of such spaces, each one containing 
the signal at a lesser resolution. 

The scaling function $v(t)$ implies a space $V_0$, which is the space of all functions 
$cv(t)$ for some constant $c$. Similarly, the first wavelet, $w^0(t)$, implies a 
space $W_0$, which is the space of all functions $cw^0(t)$. By construction, these two 
spaces are orthogonal: 

$$\int [a\,v(t)]\,[b\,w^0(t)]\, dt = 0 \tag{6.70}$$

If we combine the two spaces by a Cartesian sum, we generate a third, new space, 
$V_1$: 

$$V_1 = V_0 \oplus W_0 \tag{6.71}$$

This new space contains the functions that are combinations from the two subspaces: 

$$V_1 : f(t) = a\,v(t) + b\,w^0(t) \tag{6.72}$$

These are illustrated in Figure 6.19 for the Haar wavelets. So $V_1$ contains all functions 
built from two individually scaled, dilated, and translated copies of the original 
scaling function. 

There are two essential points to notice about our new space of functions, $V_1$. 
The first is that it was built by combining one space with a second, orthogonal space. 
For example, consider the space $P_0$, which contains all polynomials of order 0; that 
is, all constant functions $f_0(x) = c_0$ for some $c_0 \in \mathbb{R}$. We might have another space 
$P_1$, which contains all first-order polynomials, or linear functions $f_1(x) = c_1 x + c_0$ 
for $c_1, c_0 \in \mathbb{R}$, and a space $P_2$ for quadratics $f_2(x) = c_2 x^2 + c_1 x + c_0$, and so on. 
We can build up as many spaces as we like, where each $P_k$ contains the polynomials 
$f_k(x) = \sum_{i=0}^{k} c_i x^i$. 



FIGURE 6.19 

(a) The space $V_0 = av(t)$. (b) The space $W_0 = bw^0(t)$. (c) The space $V_1 = V_0 \oplus W_0 = av(t) + bw^0(t)$. 

FIGURE 6.20 

(a) The space $V_1$. (b) The space $W_1$. (c) The space $V_2 = V_1 \oplus W_1$. 


The second important point about this construction is that the spaces are nested. 
All the functions $f_k(x)$ in the space $P_k$ are contained in the larger space $P_{k+n}$, for 
all $n > 0$. We write this sequence of nested spaces as $P_0 \subset P_1 \subset P_2 \subset \cdots$. In terms of 
our construction of $V_1$, we find $V_0 \subset V_1$, since all functions $av(t)$ are in the space of 
the functions $av(t) + bw^0(t)$. In general, we can build a sequence of closed, nested 
subspaces $V_i$, such that 

$$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \tag{6.73}$$

We are now ready to describe the multiresolution framework for wavelets. Each 
space $V_m$ is made up of all linear combinations of the functions $v(2^m t - k)$, and 
each space $W_m$ is made up of all linear combinations of the functions $w^0(2^m t - k)$. 
For example, the space $V_2$ contains four copies of the scaling function, each quarter-width 
and individually scaled. We get this by combining $V_1$ with $W_1$, which is the 
set of quarter-wavelets, as shown in Figure 6.20. We can write this as 

$$\begin{aligned}
V_2 &= V_1 \oplus W_1 \\
&= V_0 \oplus W_0 \oplus W_1
\end{aligned} \tag{6.74}$$

In general, we build each space $V_{m+1}$ by combining the space $V_m$ with an orthogonal 
space $W_m$, recursively working our way back to $V_0$: 

$$\begin{aligned}
V_{m+1} &= V_m \oplus W_m \\
&= V_0 \oplus W_0 \oplus W_1 \oplus \cdots \oplus W_m
\end{aligned} \tag{6.75}$$

We will summarize these properties with respect to all functions in the class 
$f \in L^2(\mathbb{R})$; that is, all functions that satisfy $\int [f(t)]^2\, dt < \infty$. In general, a multiresolution 
analysis is a set of closed subspaces $V_m$, $m \in \mathbb{Z}$, with the following properties: 

1 Containment: $V_m \subset V_{m+1}$. 

2 Distinctness: $\bigcap_m V_m = \{0\}$. 

3 Completeness: $\bigcup_m V_m = L^2(\mathbb{R})$. 

4 Shifting: if $f(t) \in V_m$, then $f(t - k) \in V_m$ for all $k \in \mathbb{Z}$. 

5 Scaling: if $f(t) \in V_m$, then $f(2t) \in V_{m+1}$, for all $f \in L^2(\mathbb{R})$. 

6 Basis: There exists a scaling function $v(t) \in V_0$, such that for all $m$, the set of 
functions 

$$\{v_{mn}(t) = 2^{-m/2}\, v(2^{-m} t - n)\}$$

forms an orthonormal basis for $V_m$; that is, 

$$\int v_{mp}(t)\, v_{mq}(t)\, dt = \delta(p - q)$$

Wavelets work out so well because these properties are inherently satisfied by 
design. In particular, the dilation equation implies containment and scaling. 
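
For sampled signals and the Haar wavelets, the chain $V_{m+1} = V_m \oplus W_m$ can be sketched directly: each step down in resolution replaces pairs of samples by their average (the part in $V_m$) and half their difference (the part in $W_m$). The function names below are invented, and the averaging convention is the unnormalized one used in the compression example of Equation 6.53.

```python
def split(signal):
    """Go from V_{m+1} to V_m and W_m: pairwise averages and half-differences."""
    pairs = list(zip(signal[0::2], signal[1::2]))
    smooth = [(a + b) / 2 for a, b in pairs]
    detail = [(a - b) / 2 for a, b in pairs]
    return smooth, detail


def merge(smooth, detail):
    """Inverse of split: rebuild the finer signal from the two parts."""
    fine = []
    for s, d in zip(smooth, detail):
        fine += [s + d, s - d]
    return fine


def resolutions(signal):
    """The signal at every resolution, finest first, plus the detail at each step."""
    levels, details = [list(signal)], []
    while len(levels[-1]) > 1:
        smooth, detail = split(levels[-1])
        levels.append(smooth)
        details.append(detail)
    return levels, details


levels, details = resolutions([1, 7, 3, 1, 6, 0, 9, 5])
print(levels)
print(details)
```

The coarsest level together with all the detail lists holds exactly as many numbers as the original signal, and `merge` applied from the coarsest level upward restores it.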


6.9 Wavelets in the Fourier Domain 

Let’s look at the dilation equation in the Fourier domain. Taking $v(t)$ from Equation 6.11 
and plugging it into the Fourier transform from Equation 5.55 to find its 
transform $\Phi(\omega)$, we find 

$$\begin{aligned}
\Phi(\omega) &= \kappa \int v(t)\, e^{-j\omega t}\, dt \\
&= \kappa \int \sum_k c_k\, v(2t - k)\, e^{-j\omega t}\, dt \\
&= \kappa \sum_k c_k \int v(2t - k)\, e^{-j\omega t}\, dt
\end{aligned} \tag{6.76}$$
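Setting $y = 2t - k$, so that $t = (y + k)/2$ and $dt = dy/2$, we find 

$$\begin{aligned}
\Phi(\omega) &= \kappa \sum_k c_k \int v(y)\, e^{-j\omega k/2}\, e^{-j\omega y/2}\, \tfrac{1}{2}\, dy \\
&= \kappa\, \tfrac{1}{2} \sum_k c_k\, e^{-jk\omega/2} \int v(y)\, e^{-jy\omega/2}\, dy \\
&= \Bigl[ \tfrac{1}{2} \sum_k c_k\, e^{-jk\omega/2} \Bigr] \Bigl[ \kappa \int v(y)\, e^{-jy\omega/2}\, dy \Bigr] \\
&= P(\omega/2)\, \Phi(\omega/2)
\end{aligned} \tag{6.77}$$

where we have set the symbol $P(\omega) = \tfrac{1}{2} \sum_k c_k\, e^{-jk\omega}$. There is a natural recursion in 
Equation 6.77, inherited from the dilation equation. Following this recursion, we 
see that if $\Phi(\omega) = P(\omega/2)\Phi(\omega/2)$, then $\Phi(\omega) = P(\omega/2)[P(\omega/4)\Phi(\omega/4)]$, and so 
on, leading us to conclude 

$$\Phi(\omega) = \prod_{k=1}^{\infty} P\!\left(\frac{\omega}{2^k}\right) \tag{6.78}$$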

The conditions on the coefficients we mentioned earlier have counterparts in the 
frequency domain. For example, the condition in Equation 6.64 can be specified in 
the frequency domain by requiring that $P(\omega)$ has a zero of order $p$ at $\omega = \pi$; that is, 
$P(\omega)$ contains a factor $(\omega - \pi)^p$ [90]. 
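
The infinite product can be tried out numerically for the Haar case, where the answer is known in closed form. The sketch below assumes the symbol is normalized as $P(\omega) = \tfrac{1}{2}\sum_k c_k e^{-jk\omega}$, so that $P(0) = 1$ when the coefficients sum to 2, and it takes the constant in front of the Fourier integral to be 1. Under those assumptions the truncated product for $c_0 = c_1 = 1$ should approach the transform of a unit box on $[0, 1]$. The function names are invented for this illustration.

```python
import cmath
import math


def symbol(c, omega):
    """P(omega) = (1/2) sum_k c_k exp(-j k omega)."""
    return 0.5 * sum(ck * cmath.exp(-1j * k * omega) for k, ck in enumerate(c))


def transform_by_product(c, omega, terms=40):
    """Truncation of the infinite product of P(omega / 2**k), k = 1, 2, ..."""
    result = 1.0 + 0.0j
    for k in range(1, terms + 1):
        result *= symbol(c, omega / 2 ** k)
    return result


def box_transform(omega):
    """Integral of exp(-j omega t) over [0, 1]: the transform of the Haar box,
    with the constant in front of the Fourier integral taken to be 1."""
    if omega == 0:
        return 1.0 + 0.0j
    return (1 - cmath.exp(-1j * omega)) / (1j * omega)


haar = [1.0, 1.0]
s3 = math.sqrt(3.0)
d4 = [(1 + s3) / 4, (3 + s3) / 4, (3 - s3) / 4, (1 - s3) / 4]
print(abs(transform_by_product(haar, 2.0) - box_transform(2.0)))
print(abs(symbol(d4, math.pi)))
```

The last line evaluates the $D_4$ symbol at $\omega = \pi$, where the frequency-domain form of the order condition says it should vanish.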

Consider now what happens to the Fourier transform $W(\omega)$ as a wavelet $w(t)$ is 
dilated. We know from the delay property of Table 5.1 that when $w(t)$ and $W(\omega)$ 
form a Fourier pair, then 

$$w^{a,b}(t) = \frac{1}{\sqrt{a}}\, w\!\left(\frac{t - b}{a}\right) \;\longleftrightarrow\; W^{a,b}(\omega) = \sqrt{a}\, W(a\omega)\, e^{-jb\omega} \tag{6.79}$$

This is as expected; as the wavelet is stretched in time ($a > 1$), its Fourier transform 
compresses, and vice versa. 

We can state the Parseval relation for the wavelet transform that mirrors the 
relation for the Fourier transform in Equation 5.64. The relation involves a constant 
$C_w$, which is central to a formula called the resolution of identity. This is basically a 
compact statement that we can reconstruct a function from its wavelet coefficients, 
the wavelet functions, and the constant $C_w$. The resolution of identity is 

$$x(t) = \frac{1}{C_w} \int_{-\infty}^{\infty} \int_{0}^{\infty} \frac{1}{a^2}\, b^{a,b}\, w^{a,b}(t)\, da\, db \tag{6.80}$$


The equation says we can find $x(t)$ by scaling each wavelet $w^{a,b}(t)$ by its corresponding 
coefficient $b^{a,b}$, and scaling that product by the inverse-square of the dilation 
coefficient $a$. The result is equal to $x(t)$ except for a constant, called $C_w$ and given by 

$$C_w = \int_0^{\infty} \frac{|W(\omega)|^2}{\omega}\, d\omega \tag{6.81}$$

The Parseval relation may now be written as 

$$C_w \int |x(t)|^2\, dt = \iint \frac{1}{a^2}\, |b^{a,b}|^2\, da\, db \tag{6.82}$$


It is instructive to compare the time-frequency localization of the wavelet transform 
and the short-time Fourier transform, as in Figure 6.21. In this figure, we 
put a dot at the center of each transform; the horizontal position is the time at the 
transform’s center, the vertical position is its frequency. Notice that in the STFT, the 
dots are regularly spaced. We position the window at a time $nt_0$; integer values of 
$n$ yield equal horizontal spacing. We take the transform at frequency $m\omega_0$, yielding 
equal vertical spacing. So once we’ve picked our values of $t_0$ and $\omega_0$, we generate 
a whole family of transforms at equal increments. The pattern is so 
regular because we have a constant window for all times and all frequency ranges. 

The figure also contains the pattern for the wavelet transform. Notice first how 
the horizontal spacing adjusts to the frequency; as the frequency goes up and the 
time-domain support of the wavelet decreases, copies of the wavelet need to be 
located closer together in order to cover the domain. Also notice that the spacing in 
frequency space moves geometrically. This implies that the ratio of the bandwidth of 
the wavelet to its center frequency is constant. In other words, if we plot frequency 
on a logarithmic scale, the frequency response of each wavelet has the same shape. 
In electrical engineering, this is called a constant-Q resonant filter. 
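
The two sampling patterns can be generated with a few lines of code. The STFT lattice follows the text directly, with centers at $(nt_0, m\omega_0)$. For the wavelet lattice the sketch assumes a dyadic layout, with frequency $\omega_0 2^m$ and time step $t_0 / 2^m$; that particular spacing is an assumption made for illustration, chosen so that the frequency steps are geometric and the time steps shrink as frequency rises. Function names are invented.

```python
def stft_lattice(t0, w0, n_range, m_range):
    """Centers (time, frequency) = (n t0, m w0): equal steps on both axes."""
    return [(n * t0, m * w0) for m in m_range for n in n_range]


def wavelet_lattice(t0, w0, n_range, m_range):
    """Centers (n t0 / 2**m, w0 2**m): a dyadic layout assumed for illustration.
    Each octave up doubles the frequency and halves the time step."""
    return [(n * t0 / 2 ** m, w0 * 2 ** m) for m in m_range for n in n_range]


stft = stft_lattice(1.0, 1.0, range(3), range(1, 4))
wave = wavelet_lattice(1.0, 1.0, range(3), range(3))
print(stft)
print(wave)
```

In the wavelet lattice the ratio between neighboring frequencies is fixed, which is the discrete counterpart of the constant ratio of bandwidth to center frequency.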

We can compare wavelets and short-time Fourier transforms in another way as 
well. A common method for displaying the frequency content of 1D signals is to 
use the STFT to compute a spectrogram. Here we plot the magnitude of the signal’s 
STFT along a line representing the signal. This is a common technique for displaying 
the spectral content of time-varying signals such as speech and musical sounds, where 
the signal is swept over time. An example is shown in Figure 6.22(a). The vertical 
columns are generated by the different temporal windows as they are positioned in 
equal time increments. An alternative view of the same information is to look at the 
frequency information of the signal over all times, as shown in Figure 6.22(b). This 
represents the filter bank approach to displaying the frequency content of a changing 
signal. 

The corresponding diagram for wavelets is called the wavelet spectrogram, or the 
scalogram [358]. Here we plot the magnitude of the wavelet response at different 
scales. Figure 6.23(a) shows the scalogram corresponding to an impulse function 
$\delta(t - t_0)$. In the scalogram, we see a high response at very fine scales, isolating the 
impulse accurately. As the scale becomes larger, the impulse diffuses, resulting in 

FIGURE 6.21 

(a) The STFT lattice. The horizontal axis is time, the vertical axis is frequency. Each increment of 
$m, n$ causes equal steps on both axes. (b) The wavelet lattice. 

FIGURE 6.22 

(a) Spectrograms. The signal is plotted horizontally, and the magnitude of the transform is indicated 
by color (black is a small value, white is large). The STFT gives the frequency content of the signal 
within different time windows, (b) Filter banks. Here we see the amount of spectral energy in the 
entire signal for different frequency ranges, as represented by different filters. 


a cone-shaped region. Figure 6.23(b) shows the STFT spectrogram for the same 
impulse; notice that the time localization is limited to the $\Delta t$ width of the analysis 
window. Another scalogram/spectrogram pair is shown in Figure 6.23(c) and (d). 
Here the response is to a set of three sine waves, at $f_0$, $2f_0$, and $4f_0$. Notice that the 
frequency resolution in the spectrogram is a constant $\Delta f$ resulting from the fixed 
window, while the scalogram’s response enlarges with increasing scale. 

So wavelets adapt to the frequency at which they are applied. At low frequencies, 
they include large sections of the signal, and they are spaced far apart. At higher 
frequencies, the wavelets are packed more closely together and include smaller pieces 
of the signal. 

This behavior is just what we specified at the start of the chapter. When we’re 
interested in the low-frequency content of a signal, we use a wide window and include 
a lot of the signal; then we move the window a far distance and repeat. When we 
want to analyze the high-frequency content, we shrink the window to include just a 







FIGURE 6.23 

(a) The scalogram for an impulse function, (b) The spectrogram corresponding to (a), (c) The 
scalogram for a sum of three sines, (d) The spectrogram corresponding to (c). Redrawn from Rioul 
and Vetterli in IEEE SP Magazine. 
bit of the signal, and move the window in much smaller steps. This is difficult to do 
with an STFT, but wavelets have this behavior built in; that is their main strength. 


6.10 Two-Dimensional Wavelets 

Just as we built Fourier transforms for 2D signals, we can also build 2D wavelet 
transforms. These are immediately useful for image storage and compression, but 
we will see in Chapter 16 that 2D wavelets are also very useful for representing 
matrices, in particular the matrices that are part of the basic equations describing 
how light moves throughout a scene. 

There are a variety of methods for constructing multidimensional wavelets; pointers 
to some of these may be found in the work of Jawerth and Sweldens [229]. In this 
section we will focus on two of the most popular methods, called the rectangular and 
square deconstructions (though the less descriptive names standard and nonstandard 
have also been used [42]). We will discuss them in turn below. 

Both sets of methods rely on forming the tensor product of two one-parameter 
functions $a(t)$ and $b(t)$. Let’s call the 2D coordinates $x$ and $y$. The idea is that one of 
these functions is passed the $x$ parameter, and the other the $y$ parameter, and the value 
of the 2D function is the product of these two independent function evaluations; for 
example, the value at $(x, y)$ is $a(x)b(y)$. We write this new two-parameter function 
as $(a \otimes b)(x, y)$. 

In all of our discussions below, we will assume that we have been given an input 
signal at a resolution of $2^N \times 2^N$. This signal may be a sampled 2D function, a 
matrix, an image, or any other real-valued 2D data. 


6.10.1 The Rectangular Wavelet Decomposition 

Perhaps the most straightforward multidimensional wavelet is given by the rectangular 
(or standard) form. The idea is to make tensor products of all the various 
functions involved in a 1D wavelet transform, and then use those functions for the 
2D signal. 

In other words, suppose we were given a four-element input vector $x[n]$. To 
match this with the Haar basis, we would have coefficients on the scaling function 
$v = (1,1,1,1)$, the four-element wavelet $w^{0,0} = (1,1,-1,-1)$, and the two two-element 
wavelets $w^{1,0} = (1,-1,0,0)$ and $w^{1,1} = (0,0,1,-1)$. To make the 2D basis 
set, we build the sixteen tensor products that come from the combination of these 
four functions. Since each input function is four elements large, the resulting basis 
functions are $4 \times 4$ matrices. Figure 6.24 summarizes these sixteen functions. 

Notice that these functions have forms that are very similar to the 1D situation. 
We have a single function that takes an average ($v \otimes v$), and fifteen other functions 

FIGURE 6.24 

The sixteen basis functions for the rectangular wavelet decomposition of a $4 \times 4$ matrix. Each 
function is made by the tensor product of the function in the top row with the function on the left; 
for example, the third function on the top row is given by $w^{1,1} \otimes w^{0,0}$. 


that sum to zero but encode local differences (or high-frequency information) in the 
signal at different scales. By design, the term-by-term sum of each basis function 
(except $v \otimes v$) is zero, and the sum of the term-by-term products of any two different 
bases is zero; that is, the bases form an orthogonal set. 
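
The sixteen basis matrices and the two properties just stated (zero sum for all but one, and mutual orthogonality) can be checked with a short sketch. It uses the four Haar vectors given for the four-element case; which of the two functions runs along the rows is a convention assumed here, and the names `tensor`, `dot`, and `BASIS` are invented for this illustration.

```python
V = (1, 1, 1, 1)
W00 = (1, 1, -1, -1)
W10 = (1, -1, 0, 0)
W11 = (0, 0, 1, -1)
ONE_D = (V, W00, W10, W11)


def tensor(a, b):
    """Matrix of (a tensor b)(x, y) = a(x) b(y), with x along a row and y down
    the rows; which axis gets which function is a convention assumed here."""
    return [[ax * by for ax in a] for by in b]


def dot(m1, m2):
    """Sum of the term-by-term products of two matrices."""
    return sum(p * q for r1, r2 in zip(m1, m2) for p, q in zip(r1, r2))


BASIS = [tensor(a, b) for b in ONE_D for a in ONE_D]   # sixteen 4 x 4 matrices

zero_sum = [sum(map(sum, m)) == 0 for m in BASIS]
orthogonal = all(dot(BASIS[i], BASIS[j]) == 0
                 for i in range(16) for j in range(16) if i != j)
print(zero_sum)
print(orthogonal)
```

Both properties follow from the 1D vectors: the sum of a tensor product is the product of the two 1D sums, and the dot product of two tensor products is the product of the two 1D dot products.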

The new wrinkle in 2D is that we compute the difference information in several 
directions. In particular, scanning the table shows that we compute high-frequency 
information distributed in three different directions: left to right (e.g., $w^{1,0} \otimes v$), up 
to down (e.g., $v \otimes w^{1,0}$), and diagonally (e.g., $w^{1,0} \otimes w^{1,1}$). 

FIGURE 6.25 

An example decomposition in the rectangular wavelet basis. 


To transform a 2D signal into this basis, for each basis we simply multiply the 
signal term by term, sum the result, and divide by a normalizing factor (here equal 
to the number of nonzero terms in the basis function). Note that this all works only 
if the wavelets are their own duals, as in the case of the Haar bases. 
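
The transform step described above, together with its inverse, might be sketched like this for the $4 \times 4$ Haar case. The test matrix is made up for the illustration and is not the matrix $M$ of Figure 6.25; the function names are likewise invented.

```python
ONE_D = ((1, 1, 1, 1), (1, 1, -1, -1), (1, -1, 0, 0), (0, 0, 1, -1))
BASIS = [[[ax * by for ax in a] for by in b] for b in ONE_D for a in ONE_D]


def dot(m1, m2):
    """Sum of the term-by-term products of two matrices."""
    return sum(p * q for r1, r2 in zip(m1, m2) for p, q in zip(r1, r2))


def rectangular_transform(signal):
    """One coefficient per basis matrix: multiply term by term, sum, and
    divide by the number of nonzero terms in that basis matrix."""
    coeffs = []
    for basis in BASIS:
        nonzero = sum(1 for row in basis for x in row if x != 0)
        coeffs.append(dot(signal, basis) / nonzero)
    return coeffs


def rectangular_inverse(coeffs):
    """Rebuild the 4 x 4 signal as the coefficient-weighted sum of the basis."""
    return [[sum(c * basis[y][x] for c, basis in zip(coeffs, BASIS))
             for x in range(4)] for y in range(4)]


# A made-up test matrix, not the matrix M of Figure 6.25.
sample = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]
coeffs = rectangular_transform(sample)
print(coeffs)
print(rectangular_inverse(coeffs) == sample)
```

Dividing by the number of nonzero terms works because every nonzero entry of a Haar basis matrix is plus or minus one, so that count equals the basis matrix's dot product with itself.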

Figure 6.25 shows an example transformation under this basis. The matrix M 
has been designed to demonstrate the various types of directional information that 
this basis picks up most efficiently; note that five of the sixteen wavelet coefficients 
are zero. To illustrate the procedure, for the upper-left coefficient we find 

$$\frac{(1+2+2+1) - (4+5+4+5) - (2+1+3+6) + (3+3+3+3)}{16} = -\frac{3}{4}$$

6.10.2 The Square Wavelet Decomposition 

The basis functions in the last section had both square and rectangular regions of 
nonzero elements. This is useful when we want to construct a decomposition that is 
anisotropic; that is, we are concerned with different scale ranges along different dimensions. 
If we treat all dimensions the same way, we can construct a decomposition 
that has only square terms; it’s called the square (or nonstandard) decomposition. 

Once again we will build the basis functions from tensor products of the scaling 
function $v$ and the wavelets $w^{a,b}$. This time, though, we will be guided by an analogy 
to the multiresolution analysis discussed in Section 6.8. 

We begin with the simplest 2D basis function formed by $v \otimes v$; that is, a box. 
Recall that translates of $v$ span a piecewise-constant space of one dimension, which 
we will write $V_0^{(1)}$. By analogy, if $v$ is one unit on a side, then integer translates of 
$v \otimes v$ span a 2D space $V_0^{(2)}$ made up of all the functions that are piecewise-constant 
over integer-sized squares, as shown in Figure 6.26(a). 

As in the 1D case, we would like to move to a space $V_1^{(2)}$, which is piecewise-constant 
over half-integer-sized boxes, as shown in Figure 6.26(b). To move from 
the lower-resolution to the higher-resolution space, we make the Cartesian product 

FIGURE 6.26 

(a) The space $V_0^{(2)}$. (b) The space $V_1^{(2)}$. 


of $V_0^{(2)}$ with the “detail” space $W_0^{(2)}$, such that 

$$V_0^{(2)} \oplus W_0^{(2)} = V_1^{(2)}, \qquad W_0^{(2)} \perp V_0^{(2)} \tag{6.83}$$

That is, the detail space is orthogonal to the low-resolution space, and adds just the 
information we need to get to a better resolution. Eventually, repeating this step will let us 
match all 2D functions $f(x, y)$. 

To find $W_0^{(2)}$, we again take a cue from the 1D case and make the four tensor-product 
functions that come from the four combinations of the scaling function $v$ 
and the first wavelet $w^{0,0}$. That is, we build $v \otimes v$, $v \otimes w^{0,0}$, $w^{0,0} \otimes v$, and $w^{0,0} \otimes w^{0,0}$. 
These four basis functions are shown in Figure 6.27. We have given them the labels 
$A$, $H$, $V$, and $D$, respectively. 





FIGURE 6.27 


The four basic functions in the square basis. 


These four functions can match any $2 \times 2$ signal. The decomposition can be 
written simply by observing that 

$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} = g_A A + g_H H + g_V V + g_D D \tag{6.84}$$

That is, an input matrix with elements $(a, b, c, d)$ is the sum of the four basis functions 
weighted by the coefficients $(g_A, g_H, g_V, g_D)$. To find these coefficients, we can write 

FIGURE 6.28 

Decomposition of the example matrix into the square basis. 


out Equation 6.84 as four linear equations: 

$$\begin{aligned}
a &= g_A + g_H + g_V + g_D \\
b &= g_A - g_H + g_V - g_D \\
c &= g_A + g_H - g_V - g_D \\
d &= g_A - g_H - g_V + g_D
\end{aligned} \tag{6.85}$$

So we now have four equations in four unknowns. Some straightforward algebra 
reveals the expressions for the unknowns: 

$$\begin{aligned}
g_A &= (a + b + c + d)/4 \\
g_D &= (2g_A - b - c)/2 \\
g_V &= (a + b - 2g_A)/2 \\
g_H &= a - g_V - g_D - g_A
\end{aligned} \tag{6.86}$$

The transformation of the matrix in Figure 6.25 is shown in Figure 6.28. Note 
that seven of the sixteen coefficients are zero. In this figure, the two-by-two matrix 
associated with $g_A$ represents the $g_A$ coefficients for the four corners of the original 
matrix. That is, to reconstruct the upper-left two-by-two corner of the matrix, 
select the upper-left coefficient from each of the four coefficient matrices, scale its 
associated two-by-two pattern of positive and negative ones, and add these four 
scaled patterns together. Arranging the coefficients into these little matrices is just 
meant as a notational aid; they could be organized in other ways. 
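
One step of the square decomposition can be sketched as follows. The code solves the four linear equations of Equation 6.85 for each $2 \times 2$ block, taking $g_A$ to be the mean of the four entries, $(a + b + c + d)/4$, which is what adding the four equations together gives. The function names and the small test grid are invented for this illustration.

```python
def square_coefficients(a, b, c, d):
    """Solve Equation 6.85 for (g_A, g_H, g_V, g_D) given the block [[a, b], [c, d]].
    g_A is taken as the mean of the four entries, which is what adding the four
    equations of 6.85 together gives."""
    g_a = (a + b + c + d) / 4
    g_d = (2 * g_a - b - c) / 2
    g_v = (a + b - 2 * g_a) / 2
    g_h = a - g_v - g_d - g_a
    return g_a, g_h, g_v, g_d


def square_block(g_a, g_h, g_v, g_d):
    """Equation 6.85: rebuild (a, b, c, d) from the four coefficients."""
    return (g_a + g_h + g_v + g_d,
            g_a - g_h + g_v - g_d,
            g_a + g_h - g_v - g_d,
            g_a - g_h - g_v + g_d)


def square_step(matrix):
    """Coefficients for every 2 x 2 block of a matrix with even dimensions,
    returned as a grid of (g_A, g_H, g_V, g_D) tuples."""
    return [[square_coefficients(matrix[y][x], matrix[y][x + 1],
                                 matrix[y + 1][x], matrix[y + 1][x + 1])
             for x in range(0, len(matrix[0]), 2)]
            for y in range(0, len(matrix), 2)]


grid = square_step([[1, 2, 5, 5],
                    [3, 4, 5, 5]])
print(grid)
print(square_block(*grid[0][0]))
```

A constant block produces only a $g_A$ term, which is why smooth regions of an image yield many zero coefficients under this decomposition.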
6.11 Further Reading 

Gilbert Strang has written two fine introductions to wavelets; much of this chapter 
was based on the information they present [422,423]. Another recent introductory 
article is the one by Jawerth [229]. An excellent survey book with both historical 
and future perspectives has been written by Meyer [302]; this short book serves as a 
fine introduction to the subject. A more complete survey of theory and applications 
may be found in the book edited by Chui [90]. A good overview of wavelets and 
their application to signal processing may be found in Rioul and Vetterli [359]. We 
only touched on the short-time Fourier transform; a detailed discussion is given in 
Nawab’s essay [316]. 

The methods of multiresolution signal decomposition are discussed in more detail 
in the book by Akansu and Haddad [5], where wavelets are integrated into a general 
framework. Extensive, detailed discussions of wavelets may be found in the lectures 
by Daubechies [115]. Both this book and the article by Rioul and Vetterli [359] 
contain rich, recent bibliographies. 

Explicit source code for computing various wavelet transformations (including 
$D_4$) in different computer languages is available in the various second editions of 
Numerical Recipes by Press et al. [348]. C code is also available in the introductory 
article by Cody [94]. 

Applications of wavelets to computer graphics are only beginning to appear, but 
they have already been applied to the solution of integral equations (discussed in 
Chapter 16) for rendering. Some good references relevant to this subject include 
a discussion of Galerkin integration by Xu and Shann [493] and the quadrature 
analysis by Sweldens and Piessens [429]. The use of wavelets to compress the 
matrices involved in solving integral equation problems is addressed by Alpert [8], 
and other methods are covered by Beylkin et al. [42] and Alpert et al. [7]. The 
application of wavelets to nonuniform sampling is discussed by Feichtinger and 
Gröchenig [142,143]. 


6.12 Exercises 

Exercise 6.1 

Show that the “Mexican hat” wavelet 

$$w(x) = (1 - 2x^2)\, e^{-x^2}$$

has two vanishing moments. 
Exercise 6.2 

Carry out the error computation at the start of Section 6.6 using the same signal, 
but use the RMS error metric 


$$E = \left[\;\cdots\;\right]^{1/2}$$


Do you get the same order of wavelet inclusion? 

Exercise 6.3 

Using the Haar bases, find the wavelet transform for the signal 

{9,11,13, -1, -2,8, -1, -9}. Plot the scaled basis functions. 

Exercise 6.4 

Implement the Haar transform, and either implement or find an implementation of 
the Fourier transform. 

(a) Create a 256-sample signal of a single period of a sine wave, $f[n] = \sin(x_n)$, 
$n \in [0, 255]$, $x_n = 2n\pi/255$. Find the Haar and Fourier transforms of this 
signal. Using the compression method of Section 6.6 with an RMS error metric 
(see Exercise 6.2), plot the RMS error as a function of the number of wavelet 
coefficients used. Plot the RMS error as a function of the number of Fourier 
coefficients, taking them in order of increasing frequency starting with DC. 

(b) Repeat (a) with the signal $f[n] = |\sin(x_n)|$. 

(c) Repeat (a) with the signal $f[n] = \sin(x_n) + \exp[-(x_n - (\pi/3))^3]$. 



“Why would anyone play roulette,” I think to 
myself, “without wearing a computer in his 
shoe?” 

Thomas A. Bass 
(“The Eudaemonic Pie,” 1985) 
7.1 Introduction 

In previous chapters we have looked at the Fourier and wavelet transformations 
of signals. These powerful transformations give us the means for discussing what 
happens when we sample and reconstruct a signal. 

Another useful set of tools may be developed by reconsidering the formulation 
of a particular piece of the rendering problem. Consider for the moment creating a 
picture by sampling the image plane, where we ultimately want to find color values 
for each pixel. The theory of signal processing tells us how to design a filter $h(x, y)$ 
to be placed over each pixel, which will determine the contribution of the underlying 
image signal $s(x, y)$ at that location. So the final value $p$ of the sample at some 
particular $(x, y)$ may be written as 

$$p(x, y) = \iint_P s(\mu, \nu)\, h(x - \mu, y - \nu)\, d\mu\, d\nu \tag{7.1}$$

where $P$ is the pixel area. 
Once we have used signal processing to design the filter h and set up this integral, 
we are free to evaluate it any way we like. The signal-processing approach uses 
Fourier theory to guide us in taking a number of samples of s and combining them 
with h to reconstruct the continuous-time function p. Given that description of the 
signal, we can filter it to get the final value. But perhaps this is more effort than we 
really need. After all, we only want a final value at the pixel; perhaps reconstructing 
the complete signal is overkill. There may be a better way to evaluate the integral. 

The problem of efficient numerical integration is not new, but it is not nearly as 
old as analytic integration. The techniques collectively called Monte Carlo were 
developed in the 1940s to solve complex integrals with the newly developed electronic 
computers. The name comes from the essential role played by random numbers in 
choosing samples of the function used to estimate its integral. A short but 
interesting history of Monte Carlo methods appears in Chapter 1 of Hammersley and 
Handscomb’s book [183]. 

This chapter will focus on the ideas behind the Monte Carlo method and its 
applications to computer graphics. Monte Carlo methods use many results from 
basic statistics and probability; you should be comfortable with these ideas, which 
are briefly reviewed in Appendix B. 


7.2 Basic Monte Carlo Ideas 

Monte Carlo methods are designed to find the parameters of a distribution that 
specifies a random variable. So we begin with some observations (or values) of the 
random variable, and a guess at the form of its distribution. We then try to find the 
parameters of that distribution that match observed values of the random variable. 

As an analogy, suppose that we want to estimate the amplitude of thunderclaps 
during a particular rainstorm. We guess that the volume $v$ of the claps is described 
by a random variable whose values follow an exponential distribution $\exp[-\lambda v^2]$. 
Then, by finding the amplitude of several thunderclaps, we try to find the parameter 
$\lambda$ that characterizes that distribution. Notice that each thunderclap is an independent 
random variable $\eta_i$ distributed according to the exponential distribution $F_e(y)$ (all 
of these terms are discussed in Appendix B). 

The point of this procedure is that if we can characterize the distribution from 
which a random variable is drawn, then we can evaluate a variety of useful measures 
based on that distribution. The integral of the random variable over some domain is 
not the only such measure of interest in rendering, but it will be our driving problem 
in this chapter. 

A parameter estimated by Monte Carlo methods is called an estimand. We find 
a value for the estimand (called an estimate) by working with a number of observed 
random variables, called the sample (or sample set); the number of observations is 
called the sample size. Since we have already encountered these words with different 
meanings in signal processing, we will use the term observation rather than sample, 
and observation set to mean the collection of observations. So the observation set 
size refers to the number of observations we are going to work with to find our 
estimand. 

In graphics we are often concerned with finding the mean $\mu$ of a random variable. 
If we have $n$ observations $\eta_i$, we might simply average them together to form a first 
guess for the mean $\bar{\eta}$: 

$$\bar{\eta} = \sum_{i=1}^{n} \eta_i / n \tag{7.2}$$

In fact this approach is provably correct; the average will eventually approach the 
mean [183]. 

We will say that a random variable $\eta$ is normal $(\mu, \sigma)$, for mean $\mu$ and standard 
deviation $\sigma$, if it fits the normal distribution $F_n((t - \mu)/\sigma)$. Suppose that $\eta_1, \eta_2, \ldots$ 
is a sequence of $n$ independent, identically distributed random variables with mean 
$\mu$ and standard deviation $\sigma$. Their average 

$$\bar{\eta} = \frac{1}{n} \sum_{i=1}^{n} \eta_i \tag{7.3}$$

is asymptotically normal $(\mu, \sigma/\sqrt{n})$. That is, values for the average fit a normal 
distribution with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$. This is important. In other 
words, as we take more samples and produce new values for the estimand, the 
estimated values themselves will tend to be normally distributed. This is true even if 
the random variable itself is not normally distributed! This is useful 
because it shows us that the mean of the estimates is the same as the mean of the 
random variable, which is what we are seeking. Thus the estimates themselves lead 
to a value for the mean. 
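
The following is a minimal sketch of this behavior, not a definitive experiment. It assumes an exponential parent distribution with mean 1 and standard deviation 1 (chosen only because it is clearly not normal); the function names, the seed, and the observation counts are illustrative choices.

```python
import random
import statistics

def average_of_observations(n, rng):
    """Average n independent observations from an exponential parent (mean 1, std 1)."""
    return sum(rng.expovariate(1.0) for _ in range(n)) / n

def collect_averages(n, trials, seed=1):
    """Build many independent averages, each from its own n observations."""
    rng = random.Random(seed)
    return [average_of_observations(n, rng) for _ in range(trials)]

averages = collect_averages(n=100, trials=5000)
mean_of_averages = statistics.fmean(averages)
std_of_averages = statistics.pstdev(averages)

print(mean_of_averages)  # expected to be near the parent mean, 1
print(std_of_averages)   # expected to be near sigma / sqrt(n) = 0.1
```

Plotting a histogram of `averages` is a simple way to see the bell shape emerge even though the parent distribution is strongly skewed.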

A slightly more sophisticated approach would form a weighted average by 
applying a different weight $w_i$ to each observation and then normalizing the result by 
the sum of the weights: 

$$\bar{\eta} = \frac{\sum_{i=1}^{n} w_i \eta_i}{\sum_{i=1}^{n} w_i} \tag{7.4}$$

This process leads us to ask if there is some set of weights $w_i$ that results in a more 
accurate value for $\bar{\eta}$ in Equation 7.4 than Equation 7.2 (where, implicitly, each 
$w_i = 1$). Even more broadly, are there other functions of the $\eta_i$ that are even better 
than Equation 7.4? In this context, “better” means a more accurate estimate of the 
true mean with the same number of samples. Most of the Monte Carlo literature is 
involved in exploring answers to these questions. 
Some terminology will help the following discussion. Equations 7.2 and 7.4 are 
two instances of a function $t(\eta_i)$ that takes as input our observations and returns a 
value for the estimand. We call such a function an estimator. In the literature, we see 
phrases like “the estimator $t$” and “the estimate $t$” used to distinguish the function 
$t$ from a particular result returned by that function. 

We call the distribution of the random variable we seek the parent distribution. In 
the example given earlier, the parent distribution is the true distribution of amplitudes 
of thunderclaps. Since the underlying value of $\eta$ is a random variable, the values 
returned by the estimator $t$ are also random variables. This is because the values of $\eta$ 
are simply mapped through $t$; the differences between different observations will be 
propagated by $t$, so it is a random variable as well. If $t$ collapses the observation in 
some dimension (e.g., the function $t(\eta_i) = 0$ for all $\eta_i$), then it may be a constant, but 
we will still consider that to be a random variable in this context. In the Monte Carlo 
literature, the set of estimands produced by $t(\eta_i)$ is called the sampling distribution; 
as before, to avoid conflict with the signal-processing terminology of sampling, we 
call this the estimand distribution. 

Our goal, then, is to find an estimator that will take observations of the random 
variable $\eta$ and produce an estimate of the parameters of the parent distribution, such 
that the estimand distribution (that is, the collection of estimands) is located near 
the true value of the estimand and is concentrated in a narrow band. 

It turns out that we can find the estimand distribution $T(u)$ in terms of the 
estimator $t(y)$ and the parent distribution $F(y)$; this distribution is given by 

$$T(u) = P(t(\eta) \le u) = \int_{t(y) \le u} dF(y) \tag{7.5}$$

Given an $F$ and a $t$, we are interested in the difference between the result of the 
estimator $t(\eta)$ and the true value of the parameter we seek (traditionally called $\theta$). 
This difference may be expressed by the expected value of the difference between the 
estimate and $\theta$: 

$$\beta = E[t(\eta) - \theta] = \int (t(y) - \theta)\, dF(y) \tag{7.6}$$

where $E[a]$ is the expected value of $a$. The value $\beta$ is called the bias of the estimator 
$t$; it indicates the extent to which the estimate misses the true value of $\theta$. In other 
words, $\theta + \beta$ is the mean of the estimand distribution, rather than the desired value 
$\theta$. We can measure the variance of $t$ as 

$$\sigma_t^2 = E[(t(\eta) - E[t(\eta)])^2] = E[(t - \theta - \beta)^2] = \int (t(y) - \theta - \beta)^2\, dF(y) \tag{7.7}$$

The value $\sigma_t^2$ is called the sampling variance of $t$; in our case we call it the estimand 
variance. The square root of $\sigma_t^2$ is the standard deviation of the estimand 
distribution; for specificity it is usually called the standard error, or in our case the estimand 
error. 

We can use this terminology to make our discussion of estimators more precise. 
We say that $t(y)$ is a “good” estimator if $\beta$ and $\sigma_t^2$ are both small. This is just a 
restatement of our condition that the estimands be located near $\theta$ and be contained 
in a narrow band. If $\beta = 0$, we say that the estimator is unbiased. There is nothing 
inherently bad or undesirable about a biased estimator, as long as we understand 
the bias and can correct for it. If $\sigma_t^2$ is smaller than for any other estimator, then 
we say that $t$ is a minimum-variance estimator. 
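
To make the idea of bias concrete, here is a small sketch that computes the bias of two variance estimators exactly, by enumerating every possible observation set. The parent distribution (the values 0 and 1, equally likely), the observation set size of 3, and all function names are assumptions made for illustration only.

```python
from fractions import Fraction
from itertools import product

def variance_estimate(obs, divisor):
    """Sum of squared deviations from the observation mean, over a chosen divisor."""
    n = len(obs)
    mean = Fraction(sum(obs), n)
    return sum((Fraction(x) - mean) ** 2 for x in obs) / divisor

def exact_bias(n, divisor, values=(0, 1)):
    """Bias E[t] - theta, where theta is the parent variance, found by enumeration."""
    k = len(values)
    parent_mean = Fraction(sum(values), k)
    theta = sum((Fraction(v) - parent_mean) ** 2 for v in values) / k
    outcomes = list(product(values, repeat=n))  # every equally likely observation set
    expected_t = sum(variance_estimate(o, divisor) for o in outcomes) / len(outcomes)
    return expected_t - theta

n = 3
print(exact_bias(n, divisor=n))      # -1/12: dividing by n underestimates the variance
print(exact_bias(n, divisor=n - 1))  # 0: dividing by n - 1 is unbiased
```

The biased estimator is not useless: its bias is known, so it can be corrected by the factor n/(n - 1), which is exactly the situation described above.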

There are many types of estimators. When you are searching for an estimator, 
sometimes it helps to narrow the field of candidates. A classic approximation is to 
limit the search to linear estimators , that is, linear functions of the observations, as 
in Equation 7.4. The best estimator from this class for a given problem is called the 
minimum-variance linear estimator. 

Classic Monte Carlo methods create an estimand $G$ by combining $n$ identically 
distributed, independent random variables $\eta_i$. If each random variable $\eta_i$ is evaluated 
by a function $g_i$ and scaled by a weight $\lambda_i$, then the estimand $G$ may be formed as 
the sum of these weighted functions: 

$$G = \sum_{i=1}^{n} \lambda_i\, g_i(\eta_i) \tag{7.8}$$

Since the values $\eta_i$ are random variables, the function values $g_i(\eta_i)$ are as well. Thus 
we can find their mean, or expected value $E[G]$, as 

$$E[G] = E\left[\sum_{i=1}^{n} \lambda_i\, g_i(\eta_i)\right] = \sum_{i=1}^{n} \lambda_i\, E[g_i(\eta_i)] \tag{7.9}$$

since the expected value operator $E$ is linear, and therefore can be passed through the 
summation. Suppose that all the $g_i(x)$ are the same function $g(x)$, and all $\lambda_i = 1/n$. 
Then the expected value simplifies to 

$$E[G] = E\left[\frac{1}{n} \sum_{i=1}^{n} g(\eta_i)\right] = \frac{1}{n} \sum_{i=1}^{n} E[g(\eta_i)] = E[g(x)] \tag{7.10}$$
Equation 7.10 is a crucial result. It tells us that $G$, the weighted average of our $n$ 
values of $g$, has the same expected value, or mean, as $g$ itself. So to find the mean of 
the function $g$, we can find the mean of its weighted samples; with enough samples 
the latter will eventually become the former. 

Given that Equation 7.10 approximates $E[g]$, we would like to know how quickly 
it arrives at the correct answer. To find this speed of convergence, we find the variance 
of the estimator: 

$$\operatorname{var}(G) = \operatorname{var}\left(\sum_{i=1}^{n} \frac{1}{n}\, g(\eta_i)\right) = \sum_{i=1}^{n} \frac{1}{n^2} \operatorname{var}(g(\eta_i)) = \frac{1}{n} \operatorname{var}(g(x)) = \frac{\sigma_p^2}{n} \tag{7.11}$$

where $\sigma_p$ is the standard deviation of the parent distribution. Thus the standard 
deviation $\sigma_G$ of $G$ is 

$$\sigma_G = \frac{1}{\sqrt{n}}\, \sigma_p \tag{7.12}$$

So the deviation in our estimate is related to the deviation of the parent 
distribution, and is inversely proportional to the square root of the number of samples $n$. This is a 
fundamental result of classical Monte Carlo. 
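
As a rough empirical check of this square-root behavior, the sketch below repeats the estimate many times at two observation set sizes and compares the spread of the results. The integrand g(x) = x^2 on [0, 1), the seed, and the trial counts are arbitrary assumptions for illustration.

```python
import random
import statistics

def crude_average(g, n, rng):
    """G = (1/n) * sum of g(eta_i), with eta_i uniform on [0, 1)."""
    return sum(g(rng.random()) for _ in range(n)) / n

def spread_of_estimates(g, n, trials=4000, seed=7):
    """Empirical standard deviation of G over many independent runs."""
    rng = random.Random(seed)
    estimates = [crude_average(g, n, rng) for _ in range(trials)]
    return statistics.pstdev(estimates)

def g(x):
    return x * x  # parent variance is 1/5 - 1/9 = 4/45

sigma_50 = spread_of_estimates(g, 50)
sigma_200 = spread_of_estimates(g, 200)
ratio = sigma_50 / sigma_200

print(sigma_50, sigma_200, ratio)  # four times the samples: ratio should be near 2
```

Quadrupling the number of observations only halves the spread, which is the practical cost of the square root in Equation 7.12.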

Sometimes we wish to find the variance of the parent distribution, rather than its 
mean. An unbiased estimator of the variance of the parent distribution is given by 

$$s^2 = \frac{\eta_1^2 + \cdots + \eta_n^2 - n\bar{\eta}^2}{n - 1} \tag{7.13}$$

which has a standard deviation of about 

$$\sigma_{s^2} \approx \sigma_p^2 / \sqrt{n/2} \tag{7.14}$$

The estimand error given by Equation 7.12 is exact, but the estimand error of 
Equation 7.14 is only an estimate and depends on the fourth moment of the parent 
distribution (though it is exact when the parent distribution is a normal distribution) 
[183]. Another estimate of the error is given by 

$$\sigma_s \approx \sigma_p / \sqrt{2n} \tag{7.15}$$

This formula for the estimand error is common, but biased. 
When these formulas are used for actual computation, we usually replace the 
parameter by its estimated value. Thus in Equation 7.12 the value of $\sigma_p$ would use $s$ calculated 
by Equation 7.13. 

The biggest problem with Monte Carlo methods in general is the extremely 
slow convergence of the estimand. Equation 7.12 shows that the estimand error is 
inversely proportional to the square root of the size of the observation set; in other words, 
the algorithm has convergence $O(1/\sqrt{n})$. Thus to halve the error, we must quadruple 
the number of samples. In general, to reduce the error in $n$ samples by a factor $a$, 
we must take $(a^2 - 1)n$ more samples, for a total of $a^2 n$. When the desired error is small, $n$ is typically 
very large. In computer graphics each sample is very expensive to evaluate. This 
form of estimator effectively tells us that each successive sample helps us out less and 
less, even though typically they all come at the same enormous cost. Much of the 
research in Monte Carlo methods has been directed at increasing the convergence of 
the estimator, or getting better results from fewer samples. 


7.3 Confidence 

It is often important that we have some measure of the confidence in a Monte 
Carlo estimate. That is, when we have generated a particular estimate based on 
some number of samples, can we express quantitatively how certain we are that 
our estimate is correct? Such a measure would allow us to make statements of the 
form, “We are 85% confident that the average value is within 5% of the estimate.” 
Sometimes confidence is straightforward to determine, but more often it is very 
difficult. It usually requires that we make some a priori determination of the form 
of the parent distribution. 

For example, let us suppose that the parent distribution is normal. We will now 
develop a tool that allows us to make meaningful statements about the quality of 
our estimate of the mean $\mu$ of the parent distribution. The interesting thing about 
this result will be that it does not involve estimating $\sigma_p$, so our confidence for one 
parameter is not based on a good estimate for another, which would be a troubling 
situation. 

We begin by defining the alpha measures on a set of observations of a random 
variable. We define these for a fixed observation set size $n$ and for each nonnegative 
integer $k$: 

$$\alpha_k = \frac{1}{n} \sum_{i=1}^{n} \eta_i^k \tag{7.16}$$

In particular the first alpha measure is the mean: $\alpha_1 = \bar{\eta}$. Using these quantities, we 
can define Student’s t distribution. Consider the two mutually independent random 
variables 

$$\eta_{t1} = (\alpha_1 - \mu)\sqrt{n}$$

$$\eta_{t2} = \frac{n}{n - 1}(\alpha_2 - \alpha_1^2) \tag{7.17}$$

We can form a ratio $t$ called Student’s ratio as 

$$t = \frac{\eta_{t1}}{\sqrt{\eta_{t2}}} = \frac{(\alpha_1 - \mu)\sqrt{n - 1}}{\sqrt{\alpha_2 - \alpha_1^2}} \tag{7.18}$$


Since it is a function of random variables, $t$ itself is also a random variable. It has a 
density function $S_{n-1}(x)$ given by the somewhat awkward formula 

$$S_{n-1}(x) = \frac{\Gamma(n/2)}{\sqrt{(n - 1)\pi}\;\Gamma((n - 1)/2)} \left(1 + \frac{x^2}{n - 1}\right)^{-n/2} \tag{7.19}$$

where the Gamma function $\Gamma(t)$ is defined by 

$$\Gamma(t) = \int_0^\infty x^{t-1} e^{-x}\, dx \tag{7.20}$$

The distribution $S_{n-1}(x)$ is called Student’s t distribution with $(n - 1)$ degrees of 
freedom. We can write the variance of the observation set $s^2$ as 

$$s^2 = \alpha_2 - \alpha_1^2 \tag{7.21}$$

We are now ready to create the confidence test. In the last paragraph we defined 
the random variable $t$ to have a density given by $S_{n-1}(x)$. The chance that a random 
variable lands in some range is the integral of its density over that range. So the 
likelihood that $t$ is between two reals $a$ and $b$ may be found from 

$$P(a < t < b) = \int_a^b S_{n-1}(x)\, dx \tag{7.22}$$

Some algebra allows us to rewrite this as 

$$P\left(\alpha_1 - \frac{bs}{\sqrt{n - 1}} < \mu < \alpha_1 - \frac{as}{\sqrt{n - 1}}\right) = \int_a^b S_{n-1}(x)\, dx \tag{7.23}$$

Equation 7.23 is the tool I promised earlier. Given $n$ samples and two limits $a$ 
and $b$, the probability that the true mean lies within the corresponding interval around $\alpha_1$ may be found 
explicitly as the integral over a piece of $S_{n-1}(x)$, which does not require estimating 
$\sigma_p$. Typically these integrals are precomputed and stored in a look-up table, indexed 
by $a$, $b$, $n$, and $\alpha_1 = \bar{\eta}$. 
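
Instead of a look-up table, the integral can also be evaluated numerically. The sketch below is one way to do that under stated assumptions: it uses the standard Student's t density with n - 1 degrees of freedom, composite Simpson's rule for the integral, and takes the observation set variance to be s^2 = alpha_2 - alpha_1^2. The function names and the step count are illustrative, and `math.gamma` overflows for observation sets larger than a few hundred, so this is not a production routine.

```python
import math

def student_density(x, n):
    """Student's t density with n - 1 degrees of freedom, for n observations."""
    dof = n - 1
    norm = math.gamma(n / 2) / (math.sqrt(dof * math.pi) * math.gamma(dof / 2))
    return norm * (1.0 + x * x / dof) ** (-n / 2)

def probability_between(a, b, n, steps=2000):
    """Integrate the density over [a, b] with composite Simpson's rule (even steps)."""
    h = (b - a) / steps
    total = student_density(a, n) + student_density(b, n)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * student_density(a + i * h, n)
    return total * h / 3

def mean_interval(observations, a, b):
    """Interval for the mean implied by limits a and b on Student's ratio."""
    n = len(observations)
    alpha1 = sum(observations) / n
    alpha2 = sum(x * x for x in observations) / n
    s = math.sqrt(alpha2 - alpha1 ** 2)  # assumed: s^2 = alpha_2 - alpha_1^2
    scale = s / math.sqrt(n - 1)
    return alpha1 - b * scale, alpha1 - a * scale

observations = [1.0, 2.0, 3.0, 4.0, 5.0]
low, high = mean_interval(observations, -2.0, 2.0)
confidence = probability_between(-2.0, 2.0, len(observations))
print(low, high, confidence)
```

With two observations the density reduces to the Cauchy distribution, for which the probability of landing in [-1, 1] is exactly one half; that makes a convenient sanity check on the integration.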

This analysis breaks down if the parent distribution is not normal; the less normal 
the distribution, the less accurate our estimate will be. It is difficult to find a 
completely accurate test for normality of a distribution, though several partial tests 
such as Shapiro and Silverman’s [394] have been developed; they are surveyed in 
Spanier and Gelbard’s book [415]. 


7.4 Blind Monte Carlo 

We may distinguish two approaches to improving our estimate: those that do not 
require any a priori information about the signal (which I call blind Monte Carlo), 
and those that use some knowledge of the function being integrated (which I call 
informed Monte Carlo). We begin with blind techniques in this chapter. 

We will summarize five types of blind Monte Carlo methods: 

■ crude Monte Carlo 

■ rejection Monte Carlo 

■ blind stratified sampling 

■ quasi Monte Carlo 

■ weighted Monte Carlo 


7.4.1 Crude Monte Carlo 

Crude Monte Carlo (or basic Monte Carlo) is the approach that we discussed in 
detail in Section 7.2. For completeness, we briefly repeat the estimator and 
its error values here. In computer graphics, we are usually interested in the mean (or 
average) value $\theta$ of a signal $f(x)$ over some domain, which without loss of generality 
we take as [0,1]: 

$$\theta = \int_0^1 f(x)\, dx \tag{7.24}$$

If we generate $n$ independent, uniformly distributed random variables $\xi_i \in [0,1]$, 
then from the quantities $f_i = f(\xi_i)$ we can find an unbiased estimator $\bar{f}$: 

$$\bar{f} = \frac{1}{n} \sum_{i=1}^{n} f_i \tag{7.25}$$
As shown in Section 7.2, this estimator has a variance 

$$\sigma_{\bar{f}}^2 = \frac{1}{n} \int_0^1 (f(x) - \theta)^2\, dx = \sigma_p^2 / n \tag{7.26}$$

where $\sigma_p^2$ is the variance of $f$, the parent distribution. Its estimand error is given by 

$$\sigma_{\bar{f}} = \sigma_p / \sqrt{n} \tag{7.27}$$

Equation 7.25 is usually referred to as the crude Monte Carlo estimator of $\theta$. 

Although the $O(1/\sqrt{n})$ convergence of crude Monte Carlo is slow with respect 
to the number of samples, it isn’t as bad as rejection Monte Carlo. 
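
A minimal sketch of the crude estimator follows, together with an estimate of its own estimand error formed by substituting the observed standard deviation for the unknown parent deviation. The integrand x^2 (whose true mean on [0,1] is 1/3), the seed, and the function name are assumptions for illustration.

```python
import math
import random

def crude_monte_carlo(f, n, seed=0):
    """Return the crude estimate of the mean of f on [0, 1] and its estimated error."""
    rng = random.Random(seed)
    values = [f(rng.random()) for _ in range(n)]
    estimate = sum(values) / n
    # Unbiased variance of the observations, standing in for the parent variance.
    s2 = sum((v - estimate) ** 2 for v in values) / (n - 1)
    return estimate, math.sqrt(s2 / n)

estimate, error = crude_monte_carlo(lambda x: x * x, 10000)
print(estimate, error)  # estimate near 1/3, error near 0.003
```

Reporting the error alongside the estimate costs almost nothing extra, since the observations have already been evaluated.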


7.4.2 Rejection Monte Carlo 

The rejection (or hit-or-miss) technique is worth discussing because it used to be the 
one that was most strongly recommended for Monte Carlo problems [414]. In fact, 
it should be avoided whenever possible, as we will see. 

The idea is known as rejection because it is based on creating a number of samples 
and rejecting those that don’t meet a certain condition. Crude Monte Carlo evaluates 
all the samples it generates, but rejection Monte Carlo requires us to create a sample, 
test it to see if we really do want to evaluate it, and then proceed only if the test 
succeeds. This can become very expensive if the cost of either generating or testing 
samples is high, or if there is a very low likelihood of success. In the latter case we 
end up spending most of our time generating and disposing of samples, rather than 
evaluating samples and building up an estimate of the integral. 

Suppose that we have a function $f(x)$ that is bounded by [0,1] when $0 \le x \le 1$. 
If we look at $y = f(x)$, then in the interval [0,1] the curve is entirely bounded by the 
unit square. We seek the mean $\theta$, which is simply the percentage of the square’s area 
lying under the curve. We can write this as 

$$f(x) = \int_0^1 g(x, y)\, dy \tag{7.28}$$

where 

$$g(x, y) = \begin{cases} 0 & \text{if } f(x) < y \\ 1 & \text{if } f(x) \ge y \end{cases} \tag{7.29}$$

Then $\theta$ is the area under the curve, given by a double integral: 

$$\theta = \int_0^1 \!\! \int_0^1 g(x, y)\, dx\, dy \tag{7.30}$$
We can imagine finding an estimate for $\theta$ with a thought experiment. Imagine 
throwing $n$ darts at random into the unit square, and then counting the number 
that land below $y = f(x)$. Recall that the binomial distribution $F_e(y)$ gives us the 
number of successful events out of $n$ trials, when the probability of success of each 
one is $0 < p < 1$. Thus this dart-throwing approach is sampling from the binomial 
distribution with $p = \theta$ (the chance of success on each throw is the ratio of the area 
under the curve to the total area of the square). The variance of the resulting estimate 
(the fraction of darts that land under the curve) 
can be shown, according to Hammersley and Handscomb [183], to be 

$$\sigma_b^2 = \theta(1 - \theta)/n \tag{7.31}$$


From above, we have the variance $\sigma_{\bar{f}}^2$ for crude Monte Carlo: 

$$\sigma_{\bar{f}}^2 = \frac{1}{n} \int_0^1 (f(x) - \theta)^2\, dx = \frac{1}{n} \int_0^1 f^2\, dx - \theta^2 / n \tag{7.32}$$

Comparing the two, we find the amount of error due to rejection beyond that due 
to crude Monte Carlo: 

$$\sigma_b^2 - \sigma_{\bar{f}}^2 = \frac{\theta}{n} - \frac{1}{n} \int_0^1 f^2\, dx = \frac{1}{n} \int_0^1 f(1 - f)\, dx \ge 0 \tag{7.33}$$

Thus the error due to rejection methods is always worse than that of crude Monte 
Carlo. The use of rejection instead of crude Monte Carlo gave the entire field of 
Monte Carlo a bad reputation for many years; the convergence is so poor that the 
technique is often excruciatingly slow. The improvement in crude Monte Carlo is 
the avoidance of an unnecessary step in the calculation: it replaces the 2D function 
$g(x, y)$ by its 1D expectation $f(x)$. This makes a significant change to the convergence 
properties of the method. 
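
The comparison can be checked empirically with a short sketch. It assumes the test function f(x) = x, for which the theory predicts a per-run variance of 0.25/n for rejection and (1/12)/n for crude Monte Carlo; the seeds, trial counts, and names are illustrative only.

```python
import random
import statistics

def crude_estimate(f, n, rng):
    """Average of f at n uniform positions in [0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n

def hit_or_miss_estimate(f, n, rng):
    """Fraction of n random darts in the unit square that land on or under y = f(x)."""
    hits = 0
    for _ in range(n):
        x, y = rng.random(), rng.random()
        if f(x) >= y:  # the dart is a hit when g(x, y) = 1
            hits += 1
    return hits / n

def empirical_variance(estimator, f, n, trials=2000, seed=2):
    """Variance of the estimator over many independent runs."""
    rng = random.Random(seed)
    return statistics.pvariance([estimator(f, n, rng) for _ in range(trials)])

def f(x):
    return x

var_crude = empirical_variance(crude_estimate, f, 100)
var_reject = empirical_variance(hit_or_miss_estimate, f, 100)
print(var_crude, var_reject)  # rejection should show roughly three times the variance
```

Note that rejection also consumes two random numbers per observation instead of one, so the comparison at equal n actually flatters it.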

This comparison points out an important rule of thumb that is worth keeping 
in mind for all Monte Carlo work. Hammersley and Handscomb [183] phrase this 
principle simply and directly in their book: “If, at any point of a Monte Carlo 
calculation, we can replace an estimate by an exact value, we shall reduce the 
sampling error in the final result.” 

This is a basic principle that we should apply whenever possible to improve 
Monte Carlo sampling of all types. 


7.4.3 Blind Stratified Sampling 

Suppose that we are executing a Monte Carlo algorithm and picking random 
positions to generate observations. We know that eventually the samples will follow 
a particular distribution due to the process used to generate them, but there’s no 
guarantee that any specific group of sequentially generated samples will have that 
distribution. For example, the first $n$ samples generated might all land in almost the 
same spot in the domain. We would like very much to avoid such clumping. Our 
signal inhabits the entire domain, so our samples should as well. When samples are 
precious, as they are in computer graphics, we would usually like them to sample 
the domain as efficiently as possible right from the start. 

One way to accomplish this is called stratified sampling. The basic idea is to break 
up the domain of the integrand into regions, or strata (singular stratum), where each 
region represents an equal amount of information. When we know nothing of the 
underlying function, we call this approach blind stratified sampling because we have 
to guess at the best subdivision of the domain without knowing anything about the 
function. For example, suppose we divide the 1D domain [0,1] into $k$ equal regions 
with intervals $(\alpha_{j-1}, \alpha_j)$, where $\alpha_0 = 0$ and $\alpha_k = 1$. If we decide beforehand to take 
$n_j$ samples in domain $j$, using independent random numbers $\xi_{ij}$ uniformly distributed 
on [0,1], then we can write an unbiased estimator 

$$t = \sum_{j=1}^{k} \sum_{i=1}^{n_j} (\alpha_j - \alpha_{j-1})\, \frac{1}{n_j}\, f(\alpha_{j-1} + (\alpha_j - \alpha_{j-1})\, \xi_{ij}) \tag{7.34}$$


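with variance 

$$\sigma_t^2 = \sum_{j=1}^{k} \frac{\alpha_j - \alpha_{j-1}}{n_j} \int_{\alpha_{j-1}}^{\alpha_j} f(x)^2\, dx - \sum_{j=1}^{k} \frac{1}{n_j} \left(\int_{\alpha_{j-1}}^{\alpha_j} f(x)\, dx\right)^2 \tag{7.35}$$

This variance is often lower than that of crude Monte Carlo. 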

The efficiency of stratified sampling increases proportionally to the square of the 
number of strata [183]. This means a small increase in the number of strata can 
have a large effect on the quality of our sampling. 
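
Here is a small sketch of blind stratified sampling with equal-width strata and one observation per stratum, compared against crude Monte Carlo at the same number of observations. The integrand x^2, the choice of 16 strata, the seed, and the function names are assumptions made for illustration.

```python
import random
import statistics

def crude_estimate(f, n, rng):
    """Average of f at n uniform positions in [0, 1)."""
    return sum(f(rng.random()) for _ in range(n)) / n

def stratified_estimate(f, k, rng):
    """One uniform observation in each of k equal strata, weighted by the stratum width."""
    width = 1.0 / k
    return sum(width * f((j + rng.random()) * width) for j in range(k))

def empirical_variance(estimator, f, n, trials=2000, seed=3):
    """Variance of the estimator over many independent runs."""
    rng = random.Random(seed)
    return statistics.pvariance([estimator(f, n, rng) for _ in range(trials)])

def f(x):
    return x * x

var_crude = empirical_variance(crude_estimate, f, 16)
var_stratified = empirical_variance(stratified_estimate, f, 16)
print(var_crude, var_stratified)  # the stratified variance should be far smaller
```

For a smooth integrand like this one, most of the variation in f is between strata rather than within them, which is why confining each observation to its own stratum helps so much.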

One problem with this approach is that different domains have different 
distributions of information; a region that is important to one function might be irrelevant 
to another. Thus we would like to subdivide our domains using as much knowledge 
as possible about the underlying function. Therefore we will discuss a variation on 
this technique under the section on informed Monte Carlo. 


7.4.4 Quasi Monte Carlo 

In blind stratified sampling, we tried to distribute our samples in such a way that they 
hit all the important parts of the domain in a roughly uniform but nonperiodic way. 
We did this by breaking up the domain into pieces, and then randomly sampling 
each piece. An alternative is to directly generate a sequence of samples that have the 
same characteristics. 
This is the approach taken by quasi Monte Carlo, which is also known as 
number-theoretic Monte Carlo [502]. This approach uses no random numbers, but instead 
employs techniques from number theory to generate a set of sample points that are 
roughly uniform but aperiodic throughout the domain. 

Several sequences based on number theory are summarized by Warnock [471]. 
The Halton sequence is defined for $N$-dimensional points $x_i$. Suppose that we are 
generating an $N$-dimensional point $x_m$; it is defined by 

$$x_m = (\phi_2(m), \phi_3(m), \ldots, \phi_{p_{N-1}}(m), \phi_{p_N}(m)) \tag{7.36}$$

where $p_i$ refers to the $i$th prime number ($p_1 = 2, p_2 = 3, p_3 = 5, \ldots$) and the 
function $\phi_r(m)$ is the radical-inverse function of $m$ to the base $r$. The value of 
$\phi_r(m)$ is obtained by writing $m$ in base $r$ and then reflecting the digits around the 
decimal point. For example, using the subscript to indicate base, $26_{10}$ is $11010_2$, and 
reflecting that gives $0.01011_2 = 11/32$. In base 3, $19_{10} = 201_3$, and reflecting that 
gives $0.102_3 = 11/27$. Writing $\phi_r(m)$ symbolically for a number $m$: 

$$m = a_0 r^0 + a_1 r^1 + \cdots + a_i r^i + \cdots$$

$$\phi_r(m) = a_0 r^{-1} + a_1 r^{-2} + \cdots + a_i r^{-(i+1)} + \cdots \tag{7.37}$$
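
The radical-inverse function is short to write down in code. The sketch below uses exact rational arithmetic so the two worked examples can be reproduced precisely; the function names and the default choice of the first two primes are assumptions for illustration.

```python
from fractions import Fraction

def radical_inverse(m, r):
    """Write m in base r and reflect its digits about the radix point."""
    result = Fraction(0)
    place = Fraction(1, r)  # value of the first digit after the radix point
    while m > 0:
        m, digit = divmod(m, r)  # peel off the least significant base-r digit
        result += digit * place
        place /= r
    return result

def halton_point(m, primes=(2, 3)):
    """The m-th point of a Halton sequence, one prime base per dimension."""
    return tuple(radical_inverse(m, p) for p in primes)

print(radical_inverse(26, 2))  # 11/32
print(radical_inverse(19, 3))  # 11/27
print(halton_point(1))         # (Fraction(1, 2), Fraction(1, 3))
```

In a renderer one would normally convert these fractions to floating point; the exact form is used here only to make the digit reflection easy to verify by hand.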

We might ask how well this pattern distributes samples over the image plane. 
One method for characterizing patterns is to measure their discrepancy [471], which 
is a single number. Small discrepancies correspond to evenly distributed (or 
equidistributed) patterns, and large discrepancies correspond to patterns that are unevenly 
distributed (which causes visible effects like clumping and large sparse regions). 
Warnock notes that if the first point, $x_0$, is placed at $(1, 1, \ldots, 1)$ then the 
discrepancy is usually lower than if this point is placed at the origin. 

The Hammersley sequence is very similar to the Halton sequence; it is defined by 

$$x_m = (m/N, \phi_2(m), \phi_3(m), \ldots, \phi_{p_{N-2}}(m), \phi_{p_{N-1}}(m)) \tag{7.38}$$

where $p_n$ is the $n$th prime number, starting with $p_1 = 1$ (so $p_7 = 13$). 

The Zaremba sequence builds on these ideas. It is defined in terms of a folded 
radical-inverse function $\psi_r(m)$, similar to $\phi_r(m)$. Again writing $m$ in its expansion 
base $r$, as in Equation 7.37, we define 

$$\psi_r(m) = [(a_0 + 0) \bmod r]\, r^{-1} + [(a_1 + 1) \bmod r]\, r^{-2} + \cdots + [(a_i + i) \bmod r]\, r^{-(i+1)} + \cdots \tag{7.39}$$

The difference between $\psi_r(m)$ and $\phi_r(m)$ is the addition of the positional index of 
the digit to its value, and then taking the result mod $r$ at each location. For example, 
when $m = 26_{10}$ and $r = 2$, the reversed form of $m$ (from above) is $0.01011_2$, so we 
add the index and take the sum modulo 2 at each digit: 

$$\begin{array}{rcccccccccc}
 . & 0 & 1 & 0 & 1 & 1 & 0 & 0 & 0 & 0 & 0 \\
 + & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
 = & 0 & 2 & 2 & 4 & 5 & 5 & 6 & 7 & 8 & 9 \\
 \bmod 2 & 0 & 0 & 0 & 0 & 1 & 1 & 0 & 1 & 0 & 1
\end{array} \tag{7.40}$$
which we may write as $\psi_2(26) = 0.000011\overline{01}_2$, where the overline indicates an 
infinitely repeated sequence of digits. Our other example, $m = 19$ and $r = 3$, is 
similar; the reflected digits are $0.102_3$, and the addition and modulo operations are 

$$\begin{array}{rcccccccccc}
 . & 1 & 0 & 2 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
 + & 0 & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
\hline
 = & 1 & 1 & 4 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\
 \bmod 3 & 1 & 1 & 1 & 0 & 1 & 2 & 0 & 1 & 2 & 0
\end{array} \tag{7.41}$$

so $\psi_3(19) = 0.111\overline{012}_3$. 
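
Because the folded digits never terminate, any implementation has to truncate them. The sketch below returns the first few digits of the folded radical inverse and a truncated floating-point value; the function names and the default digit count are assumptions for illustration.

```python
def folded_radical_inverse_digits(m, r, count):
    """First `count` digits after the radix point of the folded radical inverse."""
    digits = []
    for i in range(count):
        m, a_i = divmod(m, r)  # a_i is the i-th base-r digit of m, or 0 once m runs out
        digits.append((a_i + i) % r)
    return digits

def folded_radical_inverse(m, r, count=32):
    """Value of the folded radical inverse, truncated to `count` digits."""
    digits = folded_radical_inverse_digits(m, r, count)
    return sum(d / r ** (i + 1) for i, d in enumerate(digits))

print(folded_radical_inverse_digits(26, 2, 10))  # [0, 0, 0, 0, 1, 1, 0, 1, 0, 1]
print(folded_radical_inverse_digits(19, 3, 10))  # [1, 1, 1, 0, 1, 2, 0, 1, 2, 0]
```

The two printed digit lists match the bottom rows of the worked tables for m = 26, r = 2 and m = 19, r = 3.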

Each $N$-dimensional point $x_m$ generated by the Zaremba sequence is given by 

$$x_m = (\psi_2(m), \psi_3(m), \ldots, \psi_{p_{N-1}}(m), \psi_{p_N}(m)) \tag{7.42}$$

Warnock also gives some advice on how to compute the discrepancy in practice, 
and provides explicit algorithms that are tuned for efficiency for calculating the 
discrepancy of various sequences. The estimand error for quasi Monte Carlo is, for 
constants $b$ and $k$, 

$$\sigma_q = k \sqrt{(\log n)^b / n} \tag{7.43}$$


7.4.5 Weighted Monte Carlo 

Recall the nonuniform weighting method of Equation 7.4: 

$$\theta = \frac{\sum_{i=1}^{n} w_i \eta_i}{\sum_{i=1}^{n} w_i} \tag{7.44}$$

In crude Monte Carlo, all $w_i = 1$. In this section we describe weighted Monte Carlo, 
which is a method for selecting the $w_i$ to improve the convergence of the estimator. 

Weighted Monte Carlo is based on using a reconstruction rule of higher order 
than the simple average of crude Monte Carlo. The method of weighted Monte 
Carlo was first described by Yakowitz et al. [494]. The basic idea parallels the 
development of integration in most calculus texts. 

Integration is usually introduced by the use of an increasingly dense set of 
rectangles to estimate the area under a curve. The total area of these rectangles is known as 
a Riemann sum approximation to the integral. Crude Monte Carlo in 1D effectively 
makes that approximation, assuming that each rectangle has equal width: the area 
is approximated by $n$ rectangles with height $f(x_i)$ and uniform width $1/n$. If the 
samples are not uniformly spaced, then this assumption doesn’t match the reality, as 
Figure 7.1(a) illustrates. 



FIGURE 7.1 

(a) Crude Monte Carlo assumes all boxes have equal width. (b) A better estimate is to use the 
correct width for each box. 


In fact, each box has a width that is determined by the locations of its nearest 
neighbors (or the limits of the domain for samples on the end). A better 
approximation would weight each sample value by the area of this rectangle, as shown in 
Figure 7.1(b). We can continue following our analogy and move from rectangles to 
trapezoids. The associated estimator for the domain [0,1] then becomes 

$$\theta_n = \frac{1}{2} \left[ x_1 f(x_0) + \sum_{i=1}^{n} (x_{i+1} - x_{i-1}) f(x_i) + (1 - x_n) f(x_{n+1}) \right] = \frac{1}{2} \sum_{i=0}^{n} (x_{i+1} - x_i)(f(x_i) + f(x_{i+1})) \tag{7.45}$$

where the samples are sorted so that $x_1 \le x_2 \le \cdots \le x_n$, and the endpoints of the domain are 
included as $x_0 = 0$ and $x_{n+1} = 1$. 
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MOURI 7.2 

(a) f(x) = x 2 /(x 2 -f 1) 3/2 . (b) Error in f(x) as a function of number of samples n. Crude Monte 
Carlo is the thin line, weighted is heavy, (c) g(x) = e - 5 (*- 2 75 ) 2 + e -3o(x-o.75) 2 (j) £ rror j n 

g(x) as a function of number of samples n. 


Yakowitz et al. [494] have shown that if $f$ has a continuous second derivative,
then for some constant $M$, this estimator has an estimand error given by

$$\sigma_w \le M/n^2 \tag{7.46}$$

This is far better than the $\sigma_p/\sqrt{n}$ estimand error of crude Monte Carlo. To illustrate
the convergence properties of crude and weighted Monte Carlo, Figure 7.2 shows
the error from each method for two functions. One is a simple algebraic function, the
other a sum of two Gaussians, intended to approximate a pair of finite light sources
illuminating a point on a surface.

So far we have discussed only 1D integration. Everything we have said in fact
holds for one, two, and three dimensions, though the quality of improvement
diminishes with increasing dimensionality [493].


In dimensions greater than 1, there are two general approaches for weighted Monte
Carlo estimation [494].

The first is a nearest-neighbor approach, which generalizes the idea of the rectangular
rule. The idea is to partition the domain into as many cells as there are
samples. The cells tile the domain without gaps or overlaps, and each point in each
cell is closer to the sample point associated with that cell than with any other sample
point. In two dimensions, this is the Voronoi diagram induced by the sample
points, illustrated in Figure 7.3(a). If we assume that each cell has a constant height
given by the value of its associated sample, then we get a signal such as the one in
Figure 7.3(b).

If we have $n$ sample points $x_i$, then we also have $n$ cells $c_i$, each with volume
$V(c_i)$ (in 2D, this is the area of the cell). Then we can find the mean by weighting
each sample by the volume of its associated cell:

$$M = \sum_{i=1}^{n} V(c_i)\, f(x_i) \tag{7.47}$$

The estimand error in this estimator for $n$ samples in a $d$-dimensional space can be
shown to be

$$\sigma_t = O\left(1/n^{2/d}\right) \tag{7.48}$$

When $d = 2$, this convergence is

$$\sigma_t = O(1/n) \tag{7.49}$$

which is far better than the $O(1/\sqrt{n})$ convergence of crude Monte Carlo. Surprisingly,
in four dimensions, when $d = 4$, this estimand error is no better than crude Monte
Carlo, and for dimensions $d > 4$, the convergence for the nearest-neighbor rule is
slower than crude Monte Carlo [493].
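A Voronoi diagram takes some machinery to build, but in one dimension the nearest-neighbor cells are simply the intervals bounded by the midpoints between adjacent samples. The following sketch uses that 1D special case to show the volume-weighted sum; the function names are hypothetical and the domain is assumed to be [0, 1].

```python
def nearest_neighbor_weights(sorted_samples):
    """Lengths of the 1D nearest-neighbor cells of sorted samples in [0, 1]."""
    # Cell boundaries are the domain limits plus the midpoints between
    # adjacent samples; every point in a cell is closest to that cell's sample.
    mids = [(a + b) / 2.0 for a, b in zip(sorted_samples, sorted_samples[1:])]
    bounds = [0.0] + mids + [1.0]
    return [hi - lo for lo, hi in zip(bounds, bounds[1:])]


def nearest_neighbor_mc(f, samples):
    """Weight each sample value by the length of its cell and sum."""
    xs = sorted(samples)
    weights = nearest_neighbor_weights(xs)
    return sum(w * f(x) for w, x in zip(weights, xs))


samples = [0.6, 0.2, 0.9]
print(nearest_neighbor_weights(sorted(samples)))
print(nearest_neighbor_mc(lambda x: x * x, samples))
```

Because the cell lengths always sum to the length of the domain, a constant function is reproduced exactly no matter where the samples fall, which is not true of an unweighted average over clumped samples in general.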

The other approach is the trapezoid approach, which generalizes the 1D trapezoid
algorithm used for integration. This is a rather different method than the
nearest-neighbor algorithm just discussed, since it requires taking additional samples of the
function in order to evaluate the estimator.

We will first look at the algorithm in 2D. Suppose there are three samples in the
unit square, as shown in Figure 7.4(a). If we draw a vertical and horizontal line
through each sample point, then we induce a set of sixteen rectangles, as shown in
Figure 7.4(b).

FIGURE 7.3

(a) The Voronoi diagram for a few sample points in the unit square. (b) The piecewise-constant
signal based on (a).


Suppose that in addition to the three samples we know, we evaluate the function 
at the other twenty-two grid intersections. This gives us a value at each intersection, 
as shown in Figure 7.5(a). Recall that the area of a trapezoid is its width times the 
average of its heights. In 2D, the volume of the cell is approximated as the area of 
its base times the average of its heights. Note that this isn’t the same as passing a 
plane over the four corners of the cell, since in general no plane can interpolate four 
arbitrary heights. The approximation instead places a horizontal plane at the average 
height, as shown in Figure 7.5(b). So, although the trapezoid rule is continuous in 
one dimension, in general it will not be so in higher dimensions. 

The estimate of the mean is found by summing the volumes of each of these 
rectangular prisms. 

FIGURE 7.5

(a) Function values at grid intersections. (b) Constant approximations in each rectangle.

In general, suppose that we are interested in the mean value $\theta$ of a function
$f$ in a $d$-dimensional hypercube $\Omega$, which has limits [0,1] in each dimension. We
begin with a set of $n$ sample points $x_i$, where each $x_i$ is a $d$-dimensional vector; we
write $x_i = \{x_{i,1}, \ldots, x_{i,d}\}$. To set up the indexing, we will gather together the $\delta$th
component of each $x_i$, sort the result, and call it $a_\delta$. We use these values to create
a new list of $n$ vectors $y_i$. $y_1$ contains the first element of each $a_\delta$; that is, it is the
vector of the smallest values of the $x_i$ vectors in each dimension. $y_2$ is the vector of
second-smallest values, and so on. Formally, we write

$$a_\delta = \operatorname{sort}\Big(\bigcup_i x_{i,\delta}\Big), \qquad y_i = \{a_{1,i}, \ldots, a_{d,i}\} \tag{7.50}$$

We also create two bounding vectors, $y_0 = 0$ and $y_{n+1} = 1$. To build up the sum,
we walk through the list by visiting each point in [0,1] and finding the area of the
parallelepiped for which that point is the corner closest to the origin. The volume
of the cell is found by multiplying together the lengths of each of its sides. We then
scale that volume by the average value of the function evaluated at each corner.

We enumerate the corners by creating a list $S$ of the coordinates of each corner.
This is found by taking the corner under consideration, and finding the coordinates
when that point is pushed one step in each combination of directions. We then
evaluate the function at all these points and divide the result by the number of points
involved.

Formally, for $d$ dimensions and $n$ starting samples,

$$\theta = \sum_{k_1=0}^{n} \cdots \sum_{k_d=0}^{n} \left\{ \prod_{j=1}^{d} \left(y_{k_j+1,j} - y_{k_j,j}\right) \times \frac{1}{2^d} \sum_{v \in S(k_1,\ldots,k_d)} f(v) \right\} \tag{7.51}$$

where $S$ is the set of points given by

$$S(k_1, \ldots, k_d) = \bigcup_{s_1=0,1} \cdots \bigcup_{s_d=0,1} \left(y_{k_1+s_1,1}, \ldots, y_{k_d+s_d,d}\right) \tag{7.52}$$

To illustrate in 2D (so $d = 2$), we write down the $x_i$ for the $n = 3$ points in
Figure 7.3(a), and derive the corresponding $y_i$ vectors. We then show the points
involved in evaluating a sample rectangle.

$$x_1 = (.3, .7), \quad x_2 = (.8, .9), \quad x_3 = (.6, .4) \tag{7.53}$$

so

$$y_1 = (.3, .4), \quad y_2 = (.6, .7), \quad y_3 = (.8, .9) \tag{7.54}$$

After six rectangles have been evaluated, $k_1 = 2$ and $k_2 = 3$. Then the function $S$
gives the set of points

$$S(2,3) = \{(y_{2,1}, y_{1,2}), (y_{2,1}, y_{2,2}), (y_{3,1}, y_{1,2}), (y_{3,1}, y_{2,2})\} = \{(.6, .4), (.6, .7), (.8, .4), (.8, .7)\} \tag{7.55}$$

The expression for the convergence of this approach is the same as that of the
nearest-neighbor method given in Equation 7.48, except that now $n$ refers to the total
number of samples taken, rather than just the starting set. If there are $n$ samples in
the original set of $d$-dimensional samples $x_i$, then we need a total of $N = (n + 2)^d$
samples, requiring $(n + 2)^d - n$ new samples to be evaluated. When $n \gg d$, this cost
is significant. For example, in $d = 2$ dimensions, if we start with $n = 10$ samples, we
need to take 134 more to evaluate the trapezoidal estimator.
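The procedure just described can be sketched as follows. This is an illustrative reading of the method rather than a reference implementation: it assumes the domain is the unit hypercube, that every grid cell from the bounding value 0 up to the bounding value 1 is visited, and that the corner average divides by the 2^d corners of each cell. The name `trapezoid_mc` is hypothetical.

```python
from itertools import product


def trapezoid_mc(f, samples):
    """Grid-based trapezoid estimate over the unit hypercube.

    Returns the estimate and the number of distinct function evaluations.
    """
    d = len(samples[0])
    # For each dimension: the sorted sample coordinates, bracketed by the
    # bounding values 0 and 1.
    axes = [[0.0] + sorted(x[j] for x in samples) + [1.0] for j in range(d)]

    cache = {}  # grid point -> function value, so each point is evaluated once

    def value(point):
        if point not in cache:
            cache[point] = f(*point)
        return cache[point]

    total = 0.0
    # Visit every cell, identified by the index of its corner nearest the origin.
    for ks in product(*(range(len(axis) - 1) for axis in axes)):
        volume = 1.0
        for j, k in enumerate(ks):
            volume *= axes[j][k + 1] - axes[j][k]
        # The 2^d corners: step 0 or 1 grid lines along each dimension.
        corners = product(*((axes[j][k], axes[j][k + 1]) for j, k in enumerate(ks)))
        total += volume * sum(value(c) for c in corners) / 2 ** d
    return total, len(cache)


samples = [(0.3, 0.7), (0.8, 0.9), (0.6, 0.4)]
estimate, evaluations = trapezoid_mc(lambda x, y: x * y, samples)
print(estimate, evaluations)
```

For the three samples of the 2D example the grid has 5 x 5 intersections, so the function is evaluated at 25 points even though only 3 of them were original samples; this is the growth in cost that the text warns about.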

When samples are expensive, as in computer graphics, the increased speed of 
convergence may be more than offset by the increased cost of estimating each sample. 
In other words, this technique may produce a much better result for 144 samples 
than some other method, but that result may be far more precise than we require; a 
cruder technique that gives an acceptable answer after a smaller number of samples 
may be preferable. The basic problem is that the number of samples required by this 
method does not increase in small increments but in huge jumps, so we don’t have 
the option to stop as soon as our estimate has enough precision. 


7.5 Informed Monte Carlo 

Blind Monte Carlo techniques are based on trying to find good estimates for signals 
about which we know nothing. If we do know something about the signal, then we 
should exploit that information to the fullest in order to save time and computational 
expense. We call methods that use knowledge about the signal to guide the sampling 
informed Monte Carlo . 

Each informed Monte Carlo method exploits some knowledge or estimate of the 
underlying integrand f(x) to direct the placement of sample points. 

We will summarize four important informed Monte Carlo methods: 

■ informed stratified sampling 

■ importance sampling 

■ control variates 

■ antithetic variates 


7.5.1 Informed Stratified Sampling 

We saw earlier that blind stratified sampling was a technique for subdividing the 
domain so that even a small number of samples would be roughly uniformly scattered 
over the domain. 

This is an advantage over simpler methods that might produce clumps of samples,
but with knowledge of the function we can do even better. Suppose we stratify the
domain of a function as in Figure 7.6(a). This isn’t the best stratification: some
very simple pieces of the function are represented by many strata, while complex
regions of the domain get only one stratum. A better subdivision is shown in
Figure 7.6(b), where each sample has a chance to report about the same amount of
useful information.

FIGURE 7.6

(a) A function and a set of uniformly sized strata. (b) The same function and a set of nonuniformly
sized strata.

There are two general approaches to building the stratification. Suppose that 
each stratum i has variance Vi and mean /z*. Then one good strategy is to subdivide 
so that each Vi is less than max(^i — 7 ^ A better approach is to choose the 

so that the Vi are as uniform as possible [183]. 

When we have access to the underlying function $f$, then the sample points should
be chosen so that $n_k$, the number of samples in domain $k$, is proportional to the
difference [183]:

$$n_k \propto (\alpha_k - \alpha_{k-1}) \int_{\alpha_{k-1}}^{\alpha_k} f^2(x)\,dx - \left[\int_{\alpha_{k-1}}^{\alpha_k} f(x)\,dx\right]^2 \tag{7.56}$$


7.5.2 Importance Sampling

Importance sampling is a powerful general method for reducing the variance in
many Monte Carlo calculations [239]. In the ideal situation, importance sampling
can eliminate variance altogether through the use of zero-variance estimation.
Zero-variance methods are Monte Carlo techniques, but they are designed to yield
provably consistent and exact estimates with no statistical variation.

At the heart of importance sampling is the following somewhat specialized observation.
Suppose we want the definite integral of a product of real functions over
some interval, and we know one of the functions analytically or numerically. Then
we can use that information to guide our sampling of the product, and get a good
answer more quickly than if we did not know one of the functions. In symbols, we
have two real-valued functions $f: \mathcal{R}^n \to \mathcal{R}$ and $g: \mathcal{R}^n \to \mathcal{R}$, and we want to find the
integral of their product over some $n$-dimensional domain $\Omega$:

$$I = \int_\Omega f(x)\, g(x)\,dx \tag{7.57}$$

Some important problems in rendering may be expressed as finding integrals of this
form (for example, $f$ could be a filter over a pixel and $g$ an image function, so that
$I$ is the color value of a pixel).

We will see that the idea of importance sampling is that some regions of the 
function will contribute more to the final estimate than other regions. These are 
typically places where the function has a large value, or varies in value significantly 
and quickly. We say that these regions have more “importance” than others. The 
goal will be to sample these regions more densely to get a better idea of what’s 
happening. But we need to compensate for the nonuniform sampling so that we 
don’t bias our final answer. 

We can develop importance sampling from some basic ideas. Suppose that we
want to find the integral of a 1D function $g(x)$ over some interval $T$:

$$G = \int g(x)\,dx \tag{7.58}$$

We can draw uniformly distributed samples $\eta_i$ from the interval $T$ and compute an
estimate of the integral by summing the values:

$$G \approx \sum_{i}^{n} g(x_i) \tag{7.59}$$

We can write this operation in another way that will open up some new possibilities.
We write the integral as the product of $g$ with another function $f$, which we call
the importance function. The function $f$ is the probability density function for the
samples. We can write the integral above as

$$G = \int g(x)\, f_u(x)\,dx \tag{7.60}$$

where $f_u(x) = 1$. To estimate Equation 7.60, we draw random variables $\eta_i$ from the
density function defined by $f$ (that is, $\eta \sim f$), and evaluate $g(\eta_i)$, summing them as
in Equation 7.59.

FIGURE 7.7

(a) A function $g(x)$. (b) An importance function $f(x)$ for $g$. (c) Samples chosen from the density
function $f$.


Suppose we choose a function $f$ that is large where $g$ is “interesting” (as defined
above), and small otherwise, as in Figure 7.7. Then when we draw our samples
from $f$, we will get more samples in the important regions of $g$, and fewer in the less
interesting regions, as shown in Figure 7.7(c).

This is a good idea, but now we’re no longer integrating $g$ alone; we’re getting
the product of $g$ and $f$. Another way to think of this is that as we draw samples
from $g$, the weight we attach to those samples is given by the corresponding value
of $f$. Where $f$ is large, we weight the sample by a large value, since we consider it
important. So rather than integrating $g(x)$, we’re integrating the product $g(x)f(x)$.
Since our interest is in integrating $g$ to find $G$, we can compensate by dividing through
by $f$:

$$G = \int \frac{g(x)\,f(x)}{f(x)}\,dx = \int \frac{g(x)}{f(x)}\, f(x)\,dx \tag{7.61}$$

FIGURE 1.5

Photomicrographs of the human retina at different distances from the center.
The large cells are cones and the small ones are rods. The photos are each
about 44 μm in width. (a) 1.35 mm from the center of the retina. (b) 5 mm
from the center of the retina. (c) 8 mm from the center of the retina.
(Courtesy of Christine Curcio.)

FIGURE 1.6

Photoreceptor density in the human retina. (a) Cones in the entire
retina; circles are 5.94 mm apart. The black oval is the optic disk.
(b) Cones in the fovea; circles are 0.4 mm apart. (c) Rods for the
entire retina; circle spacing as in (a). (d) Rods near the fovea; circle
spacing as in (b). (e) Rods immediately around the fovea; circles are
0.2 mm apart. (Courtesy of Christine Curcio.)

FIGURE 2.6

(a) Color interpolation in RGB space. (b) The same colors in XYZ space.
(c) The same colors in L*u*v* space.

FIGURE 3.32

A printer gamut and a monitor gamut on a chromaticity diagram. Reprinted,
by permission, from Stone et al. in ACM Transactions on Graphics, fig. 8, p. 265.

FIGURE 10.90

Reconstruction filter phenomena. (a) Ringing. (b) Sample-frequency ripple.
(c) Anisotropy. (d) Filter blurring. (e) Reconstruction error. Reprinted, by
permission, from Mitchell and Netravali in Computer Graphics (Proc. Siggraph
’88), figs. 4, 6, 8, 9, 11, pp. 226-228.

FIGURE 10.101

(a) The test situation: a straight edge between black and white regions. (b) A failure of
weighted-average reconstruction. Reprinted, by permission, from Mitchell in Computer
Graphics (Proc. Siggraph ’87), fig. 11, p. 72.

FIGURE 10.103

Reconstruction with the Mitchell multistage filter.
Reprinted, by permission, from Mitchell in Computer
Graphics (Proc. Siggraph ’87), fig. 14, p. 72.

There are three conditions we need to satisfy to apply Equation 7.61 [239]:

1 $f(x) \ge 0$

2 $\int f(x)\,dx = 1$

3 $g(x)/f(x) < \infty$ except at a countable number of points

The variance of Equation 7.61, as described in Kalos and Whitlock [239], is

$$\operatorname{var}(G) = \int \frac{g^2(x)}{f(x)}\,dx - G^2 \tag{7.62}$$

Since $G^2$ is a fixed quantity (the square of the integral we are seeking), the only
term we can influence is the integral, and we would like it to be as small as possible.
The best we could do is to bring the integral down to $G^2$; then the variance would
vanish and the estimate would contain no statistical error at all, which would be the
best situation. We would like to find the best $f$ to minimize this integral. It may be
tempting to choose an $f$ that is large, causing the denominator to drive the fraction
to zero, but the three conditions mentioned above must still be satisfied.

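An alternative is to use Lagrange multipliers [239]. Here we try to pick a scalar
$\lambda$ to minimize

$$L(f) = \int \frac{g^2(x)}{f(x)}\,dx + \lambda \int f(x)\,dx \tag{7.63}$$

To find the minimum of this function, we differentiate and set the result to zero:

$$0 = \frac{\delta}{\delta f}\left[\int \frac{g^2(x)}{f(x)}\,dx + \lambda \int f(x)\,dx\right] = -\frac{g^2(x)}{f^2(x)} + \lambda \tag{7.64}$$

Solving for $f(x)$, and renaming the constant $1/\sqrt{\lambda}$ as $\lambda$, we find

$$f(x) = \lambda\,|g(x)| \tag{7.65}$$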

So the ideal $f(x)$ is some multiple of the absolute value of $g(x)$. An example is
shown in Figure 7.8.

FIGURE 7.8

(a) $g(x)$. (b) $f(x) = |g(x)|$.

We now need to find the value of $\lambda$. Recall condition 2 above on $f$, which stated
that $\int f(x)\,dx = 1$. If $g(x) > 0$, then $f(x) = \lambda\, g(x)$, so

$$1 = \int f(x)\,dx = \lambda \int g(x)\,dx = \lambda G \tag{7.66}$$

where the last step used the definition of $G$ from Equation 7.58. We find then that
$\lambda = 1/G$. So we can construct $f(x)$ from

$$f(x) = \frac{g(x)}{G} \tag{7.67}$$


Now that we have defined $f(x)$, we would like to know how good it is as a
function to direct our sampling strategy. Let’s proceed by drawing samples from
$g(x)/f(x)$ using the density function $f(x)$, and forming the integral $G_n$ after the first
$n$ samples:

$$G_n = \frac{1}{n}\sum_{i=1}^{n} \frac{g(x_i)}{f(x_i)} = \frac{1}{n}\sum_{i=1}^{n} \frac{g(x_i)}{g(x_i)/G} = \frac{1}{n}\sum_{i=1}^{n} G = G \tag{7.68}$$


This derivation shows us that this choice of f(x) gives us a perfect density function 
for use with this g(x); after any n samples, the estimated integral is identical to the 
true result. 

There is a drawback to this scheme, and that is that in order to build $f(x)$, we
need to already know $G$, which is precisely what we are seeking. Of course, if we
already knew $G$, then we wouldn’t bother with Monte Carlo at all, so it may appear
that this approach is useless. But we don’t need to use exactly the right $f(x)$; any
function $\tilde{f}(x)$ “close” to $f(x)$ will reduce the variance because it will cluster samples
more densely where they will matter most in the final result.

The correct use of “close” in this regard is difficult to quantify. Shirley has
presented some results [402] that show that when $\tilde{f}(x)$ departs too far from the
ideal $f(x)$, the variance of the estimate can actually increase dramatically, giving us
much more work to do than if we simply drew samples uniformly. However, when
the function is chosen carefully, the variance can also be decreased by a dramatic
amount [239].

It’s important when choosing $\tilde{f}(x)$ that it satisfy the three constraints listed earlier.
An analytic compliance with these constraints is best, but a numerical integration
of sufficient accuracy, followed by a normalization phase, will allow any bounded
function to be used for $\tilde{f}(x)$.
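A small numerical experiment illustrates both the ideal density and a merely "close" one. The sketch below is hypothetical throughout: it assumes the integrand g(x) = x^2 on [0, 1] (so G = 1/3), samples by inverting the cumulative distribution function of each density, and uses names such as `importance_estimate` that do not come from the text.

```python
import random


def importance_estimate(g, pdf, draw, n, rng):
    """Average g(x)/pdf(x) over n samples x drawn from the density pdf."""
    total = 0.0
    for _ in range(n):
        x = draw(rng)
        total += g(x) / pdf(x)
    return total / n


def g(x):
    """Assumed integrand; its integral over [0, 1] is G = 1/3."""
    return x * x


def unit(rng):
    """A uniform variate in (0, 1], so the densities below are never zero."""
    return 1.0 - rng.random()


rng = random.Random(1)  # fixed seed so the run is repeatable

# Ideal density g(x)/G = 3x^2; its inverse CDF is u^(1/3).
ideal = importance_estimate(g, lambda x: 3.0 * x * x,
                            lambda r: unit(r) ** (1.0 / 3.0), 5, rng)

# A merely "close" density 2x; its inverse CDF is sqrt(u).
close = importance_estimate(g, lambda x: 2.0 * x,
                            lambda r: unit(r) ** 0.5, 20000, rng)

# Uniform density: this is crude Monte Carlo.
crude = importance_estimate(g, lambda x: 1.0, unit, 20000, rng)

print(ideal, close, crude)
```

With the ideal density every term of the sum equals G, so even five samples reproduce the integral up to rounding; the other two densities give estimates that fluctuate around G from run to run.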


7.5.3 Control Variates

The method of control variates is similar to importance sampling [183]. 

The basic idea of control variates is to break the integral into the sum of two new 
integrals, one of which can be handled theoretically. If the new function is positively 
correlated with the function we started with, then it will tend to produce values that 
correspond with those we evaluate numerically. 

So we take our unknown function $f$ and combine it with a known, analytically
integrable function $\phi$. In symbols,

$$\theta = \int_0^1 f(x)\,dx = \int_0^1 \phi(x)\,dx + \int_0^1 \left[f(x) - \phi(x)\right]dx \tag{7.69}$$

The function $\phi(x)$ is called the control variate for $f$. We have broken the integral
into two parts, one analytic and free from error, the other numerical with an associated
error. Since the total contribution of the numerical integral to the final estimate
is reduced, the amount of error contributed by that integral is also reduced. The
more substantial we can make the contribution of $\phi(x)$, the less error we will have
in the result.

Thus we have contradicting demands on $\phi$: it must be simple enough to integrate
analytically, yet complicated enough to do a good job of matching $f$. Each new
application requires a new weighing of these conflicting demands, and the associated
search for $\phi$.

The variance for this method [183] may be shown to be

$$\operatorname{var}(t) + \operatorname{var}(t') - 2\operatorname{cov}(t, t') \tag{7.70}$$

where $t$ and $t'$ denote estimators of the integrals of $f$ and $\phi$ formed from the same
samples.

The method of control variates is a good demonstration of the power we get from
the principle of replacing approximate operations with exact ones.
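As a small illustration of the split into an exact part and a sampled remainder, the sketch below estimates the integral of e^x over [0, 1] using the assumed control variate phi(x) = 1 + x, whose integral 3/2 is known exactly. Both functions, and all of the names, are hypothetical choices made for the example.

```python
import math
import random
import statistics


def control_variate_estimate(f, phi, phi_integral, n, rng):
    """Exact integral of phi plus a Monte Carlo estimate of the remainder f - phi."""
    total = 0.0
    for _ in range(n):
        x = rng.random()
        total += f(x) - phi(x)
    return phi_integral + total / n


def sample_variances(f, phi, n, rng):
    """Sample variances of f(x) and of the remainder f(x) - phi(x)."""
    xs = [rng.random() for _ in range(n)]
    var_f = statistics.pvariance([f(x) for x in xs])
    var_remainder = statistics.pvariance([f(x) - phi(x) for x in xs])
    return var_f, var_remainder


def phi(x):
    """Assumed control variate: the first two terms of the series for e^x."""
    return 1.0 + x


rng = random.Random(5)  # fixed seed so the run is repeatable
estimate = control_variate_estimate(math.exp, phi, 1.5, 10000, rng)
print(estimate, math.e - 1.0)
print(sample_variances(math.exp, phi, 10000, rng))
```

Because the remainder e^x - (1 + x) varies much less over [0, 1] than e^x itself, the sampled part of the estimate carries less variance than a crude estimate built from the same number of samples.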


7.5.4 Antithetic Variates

The method of antithetic variates takes an approach opposite to that of control
variates. Rather than augment our samples with values from a positively correlated
function, we combine our estimator $t$ with a second, negatively correlated estimator
$t'$. We choose this estimator so that it has the same expectation as $t$, though we
don’t know the actual expected value for either function. Then both estimators are
unbiased, and because they are negatively correlated, their average

$$\theta_a = (t + t')/2 \tag{7.71}$$

will be an unbiased estimator for $\theta$, with variance, according to Hammersley and
Handscomb [183],

$$\sigma_a^2 = \tfrac{1}{4}\operatorname{var}(t) + \tfrac{1}{4}\operatorname{var}(t') + \tfrac{1}{2}\operatorname{cov}(t, t') \tag{7.72}$$

where $\operatorname{cov}(t, t')$ is negative. This variance can sometimes be made smaller than that
for crude Monte Carlo for an appropriate estimator $t'$.

For example, suppose that $\xi$ is a uniformly distributed random variable. Then
$1 - \xi$ is also uniformly distributed. Thus $f(\xi)$ and $f(1 - \xi)$ are both unbiased
estimators of $\theta$. If $f$ is monotonically increasing or decreasing in an interval $T$, then
$t = f(\xi)$ and $t' = f(1 - \xi)$ will be negatively correlated within $T$, so we can find $\theta_a$
from

$$\theta_a = \tfrac{1}{2}(t + t') = \tfrac{1}{2} f(\xi) + \tfrac{1}{2} f(1 - \xi) \tag{7.73}$$
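The pairing of each uniform sample with its mirror image can be sketched as follows. The integrand e^x on [0, 1] is an assumed example of a monotonic function, and the function names are hypothetical.

```python
import math
import random


def crude_estimate(f, n, rng):
    """Crude Monte Carlo with n independent uniform samples."""
    return sum(f(rng.random()) for _ in range(n)) / n


def antithetic_estimate(f, n_pairs, rng):
    """Average of f(xi) and f(1 - xi) over n_pairs uniform samples xi."""
    total = 0.0
    for _ in range(n_pairs):
        xi = rng.random()
        # t = f(xi) and t' = f(1 - xi) are negatively correlated when f is monotonic.
        total += 0.5 * (f(xi) + f(1.0 - xi))
    return total / n_pairs


rng = random.Random(9)  # fixed seed so the run is repeatable
exact = math.e - 1.0
# Both estimates below use 2000 evaluations of the integrand.
print(abs(crude_estimate(math.exp, 2000, rng) - exact))
print(abs(antithetic_estimate(math.exp, 1000, rng) - exact))
```

For a function that is linear in its argument, the two halves of each pair cancel their deviations exactly, so the antithetic estimate is exact after a single pair; for other monotonic functions the cancellation is only partial.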

Antithetic variates are usually easier to find than control variates, so they are
more common in practice [183].


7.6 Adaptive Sampling

Adaptive sampling attempts to bridge the gap between informed and blind Monte 
Carlo methods. It begins with a blind sampling of the domain, and from that 
information tries to guess something about the nature of the function being sampled. 
That guess is used to create a function that is then used to guide one of the informed 
sampling techniques above. 

One of the most common applications of adaptive sampling in computer graphics 
begins with a generic stratification of the domain, which is used to drive a blind 
Monte Carlo algorithm. This sampling typically continues until a predetermined 
number of samples have been drawn. Those samples are then examined and a guess 
g is constructed to match the underlying function /. This g is then used to drive 
an importance sampling routine, which may update g periodically to improve the 
estimate and speed of convergence. 
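The two-stage idea can be sketched in 1D under some simplifying assumptions: the domain is [0, 1], the blind stage takes a few uniform samples in each of a fixed number of equal strata, and the guess is a piecewise-constant density built from the average magnitude seen in each stratum. The small floor added to every stratum and all of the names are hypothetical choices, and the guess is built once rather than updated periodically.

```python
import random


def adaptive_estimate(f, strata, pilot_per_stratum, n, rng):
    """Blind stratified pilot stage followed by importance sampling."""
    width = 1.0 / strata

    # Stage 1 (blind): a few uniform samples inside each stratum.
    pilot = []
    for j in range(strata):
        values = [abs(f((j + rng.random()) * width)) for _ in range(pilot_per_stratum)]
        pilot.append(sum(values) / len(values))

    # Build the guess: a piecewise-constant density proportional to the pilot
    # averages. The floor keeps every stratum reachable, so the density is
    # never zero where the integrand might not be.
    floor = 0.05 * sum(pilot) / strata + 1e-12
    weights = [p + floor for p in pilot]
    total_weight = sum(weights)
    probs = [w / total_weight for w in weights]

    # Stage 2 (informed): pick a stratum with probability probs[j], sample
    # uniformly inside it, and divide by the density to compensate.
    total = 0.0
    for _ in range(n):
        j = rng.choices(range(strata), weights=probs)[0]
        x = (j + rng.random()) * width
        density = probs[j] / width
        total += f(x) / density
    return total / n


rng = random.Random(3)  # fixed seed so the run is repeatable
print(adaptive_estimate(lambda x: x * x, 8, 4, 20000, rng))
```

The pilot samples are used here only to shape the density and are not folded into the final average; a practical implementation would probably want to reuse them, since samples are expensive.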

We will revisit this idea in much more detail in the following chapters. 


7.7 Other Approaches


As we have seen above, there are two main approaches to improving the efficiency 
of Monte Carlo: finding a better estimator, and more carefully selecting where to 
place our samples. These different strategies may be applied in a variety of ways, 
depending on whether or not we know something about the underlying function. 

We have not listed all the efficiency methods developed since Monte Carlo was 
introduced in the 1940s. Further information on other approaches may be found in 
the references in the Further Reading section, particularly in the work of Hammersley 
and Handscomb [183] and Spanier and Gelbard [415]. 


7.8 Summary 


Figure 7.9 gives a summary of the nine methods for Monte Carlo estimation discussed 
in this chapter. 

The efficiency of these techniques varies widely, depending on the functions involved.
For informed strategies, the quality of the auxiliary function can make a
great difference in accelerating convergence. Hammersley and Handscomb [183]
have evaluated a single test function consistently with a variety of methods. In rough
terms, they found what we would expect: informed techniques were superior to
blind methods, and the more knowledge that could be applied in the form of a good
auxiliary function, the faster the technique converged.

Reasonable arguments may be found in Kalos and Whitlock [239] advocating
importance sampling over stratified sampling, and vice versa in Shirley [402]. These
are both true depending on the function being integrated, the quality of the importance
function, and the quality of the stratification. It’s probably true that if
we have a good idea of the function’s shape, importance sampling will lead to a
faster solution than informed stratification, but when nothing is known about the
integrand, blind stratification will usually be superior to importance sampling with
an arbitrarily guessed importance function. This is an area where experience with
the particular function being sampled is of great value. Many of the techniques in
Chapter 10 are the result of different practitioners’ approaches to this challenging
engineering problem.

| Name | Estimator | Estimand error |
|---|---|---|
| Crude MC | $\frac{1}{n}\sum_{i=1}^{n} f(x_i)$ | $\sigma_p/\sqrt{n}$ |
| Rejection | Rejection | $\sqrt{m(1-m)}$ |
| Blind stratification | $\sum_{j=1}^{k}\sum_{i=1}^{n_j} \frac{\alpha_j-\alpha_{j-1}}{n_j}\, f\left(\alpha_{j-1} + (\alpha_j-\alpha_{j-1})\,\xi_{ij}\right)$ | $\left[\sum_{j=1}^{k}\frac{\alpha_j-\alpha_{j-1}}{n_j}\int_{\alpha_{j-1}}^{\alpha_j} f^2(x)\,dx - \sum_{j=1}^{k}\frac{1}{n_j}\left(\int_{\alpha_{j-1}}^{\alpha_j} f(x)\,dx\right)^2\right]^{1/2}$ |
| Quasi MC | $\frac{1}{n}\sum_{i=1}^{n} f(x_i)$ | $k(\log n)^b/n$ |
| Weighted MC | $\frac{1}{2}\sum_{i=0}^{n} (y_{i+1} - y_i)\left(f(y_i) + f(y_{i+1})\right)$ | $M/n^2$ |
| Informed stratification | Same as for blind stratification | Same as for blind stratification |
| Importance sampling | $\int_0^1 \frac{f(x)}{g(x)}\,dG(x)$ | |
| Control variates | $\int_0^1 \phi(x)\,dx + \int_0^1 \left[f(x) - \phi(x)\right]dx$ | $\sqrt{\operatorname{var}(t) + \operatorname{var}(t') - 2\operatorname{cov}(t, t')}$ |
| Antithetic variates | $(t + t')/2$ | $\sqrt{\tfrac{1}{4}\operatorname{var}(t) + \tfrac{1}{4}\operatorname{var}(t') + \tfrac{1}{2}\operatorname{cov}(t, t')}$ |

FIGURE 7.9

A summary of Monte Carlo methods.


7.9 Further Reading 

A very early and explicit discussion of Monte Carlo methods in practice may be 
found in Cashwell and Everett [76]. 

Much of this chapter is based on the excellent book by Hammersley and Handscomb
[183]. This is an ideal starting point for further reading on both theoretical
and practical issues. Another good book for study and reference is the volume by
Kalos and Whitlock [239]. The book by Spanier and Gelbard [415] is much more
advanced in some areas and offers more detail.

The classic paper by Halton [182] is a difficult but complete survey of work up to 
1970. Different quasi-Monte Carlo patterns were studied extensively and compared 
by Warnock, who has provided a wealth of comparative data [471]. An extensive and 
detailed (but difficult) discussion of quasi-Monte Carlo and pseudorandom numbers 
has been presented by Niederreiter [319]. 

There are many reports on Monte Carlo work in the physics and nuclear engineering
literature, where accurate simulation of complex phenomena like those in
computer graphics has received a lot of careful scrutiny. Pointers to these reports
may be found in the books above.


7.10 Exercises 

Exercise 7.1

Using a standard random-number generator, take random samples of the function
$x^2$ over the interval [0,3]. What is the true mean of this function? Plot the estimated
mean as a function of the number of samples.

Exercise 7.2

Using a standard uniformly distributed random-number generator, plot the absolute
error in the estimate of the integral of $f(x) = x^2$ in the interval [0,3] as a function
of the number of samples taken, using the estimators below. Plot the error for 1 to
100 samples for each estimator.

(a) Averaged random samples.

(b) An importance function $g(x) = x$.

(c) An importance function $g(x) = x^2$.

(d) An importance function $g(x) = x^3$.

(e) An importance function $g(x) = 3x^3$.

(f) Five uniform strata.

(g) Ten uniform strata.

(h) Twenty uniform strata.

Exercise 7.3

Describe how you would apply the method of control variates to finding the integral

$$\int \sin(t)\left[1 + \cos\left(e^{-t^2}\sin t\right)\right]dt \tag{7.74}$$



Many of the sleights in this book are presented
in two segments. For example, you begin a
vanish called the French Drop by showing a
small object in your left hand and removing it
with your right. ... After you have learned to
do that smoothly and well, you are ready to add
the secret move which will make the coin
vanish. This time you do the same moves in the
same natural manner, but at the right moment
you secretly allow the coin to drop into your
left palm while your right continues on as
though still holding it.

Bill Tarr

(“Now You See It, Now You Don't!,” 1976)



UNIFORM SAMPLING AND 
RECONSTRUCTION 


8.1 Introduction 


In the introduction to Unit II, I presented aliasing as a motivating reason for the study 
of signal processing in computer graphics. The best way to discuss aliasing is to look 
at what happens in the frequency domain when a continuous signal is sampled 
(turned into a discrete signal) and reconstructed (turned back into a continuous 
signal). Our discussion of the Fourier transform in both continuous and discrete 
time has given us the tools to look at frequency space and the effect of different 
signal-processing operations in that space. 

Many of our input signals in computer graphics are conceptually continuous. We 
can argue that all physical signals are ultimately quantized to the Planck constant 
of the universe, but at least some of the signals used in computer graphics are 
mathematically continuous, such as the surface of a sphere or polygon. 

Most of computer graphics works with discrete versions of these aperiodic, continuous
signals. This is because most of the operations we want to perform on these
signals become complex if we try to perform them analytically. The transition from
continuous to discrete involves taking a series of sample values of the signal. If these
values are taken at equal intervals, we say that the sampling is uniform.

FIGURE 8.1

(a) The average color in this pixel is the sum of the colors of each fragment, weighted by its area.
(b) A pixel containing curved objects and textures.

We will focus exclusively on uniform sampling in this chapter. We begin with two 
examples showing how sampling and reconstruction problems occur in computer 
graphics. 


8.1.1 Sampling: Anti-Aliasing in a Pixel

Consider a simple scheme for anti-aliasing polygonal scenes, in which we want to 
find a single color to represent each square pixel. One common approach is to 
average together the colors of all the polygon fragments visible in the pixel, weighted 
by their individual areas. Anything not covered by a polygon fragment is assumed 
to be part of the background, with its own color. An example of this approach is 
shown in Figure 8.1(a). 

If there are $n$ fragments (including the background), each with color $C_i$ and area
$A_i$, then we write the average color $C$ as

$$C = \sum_{i=0}^{n-1} C_i A_i \tag{8.1}$$

Evaluation of this equation requires only knowing the area and color of each fragment,
so it is equally valid for any kinds of objects in our pixel. Let’s suppose we
also support arbitrary curved surfaces.

To evaluate Equation 8.1, we need values for both the color and area of each 
fragment. Both of these values can be difficult to evaluate. Suppose that each surface 
has an irregular shape, and contains a complex texture, as in Figure 8.1(b); finding 
the average color requires integrating over the visible fragment of each surface. 
Obtaining an analytic expression for the visible area of the fragment may be difficult, 
and then analytically integrating the texture over that area can also be difficult, even 
assuming that an analytic expression for the texture is available. The problem gets 
even harder if the objects are moving, requiring our averages of area and color to 
also include a temporal component. 

This type of analytic anti-aliasing can work in simple rendering systems where
the geometry and color components are simple, such as smooth-shaded polygons,
but as we have seen, it quickly becomes intractable when the objects move
and their surface colors become complex.

Therefore, the analytic approach is useful in situations where the signal being 
imaged is relatively simple and well understood, such as flat polygons and text, but 
it is rarely used in general-purpose rendering systems because of the difficulty (or 
impossibility) in finding the necessary analytic expressions. 

A popular alternative to this analytic approach is to approximate the various
values in Equation 8.1 numerically. We take a number of point samples within the
pixel, and try to guess the values of $n$, $C_i$, and $A_i$ from those samples, as shown in
Figure 8.2. Here we have nine points $C_i$, each representing an area $A_i = 1/9$, so we
can approximate $C$ as

$$C \approx \sum_{i=0}^{8} \frac{C_i}{9} \tag{8.2}$$


The number of samples we need to get good estimates depends on the spatial 
distribution of the samples in the pixel, the quality of our approximation method, 
and the complexity of the scene within the pixel. When we have taken a sufficient 
number of point samples, we can use them to synthesize a new, continuous-time 
signal. 
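The point-sampling estimate of Equation 8.2 can be sketched in a few lines of Python. The placement of the samples at the centers of a regular grid of cells, the unit-square pixel, and the test scene are assumptions chosen for illustration; they are not taken from Figure 8.2.

```python
def estimate_pixel(scene, grid=3):
    """Average scene(x, y) over a grid x grid set of points in the unit pixel.

    Each sample sits at the center of its cell and stands for an equal
    area of 1 / grid**2, as in Equation 8.2 when grid is 3.
    """
    total = 0.0
    for row in range(grid):
        for col in range(grid):
            x = (col + 0.5) / grid
            y = (row + 0.5) / grid
            total += scene(x, y)
    return total / (grid * grid)

def half_covered(x, y):
    # Hypothetical scene: a white fragment covers the left half of the pixel.
    return 1.0 if x < 0.5 else 0.0

coarse = estimate_pixel(half_covered, grid=3)
fine = estimate_pixel(half_covered, grid=4)
```

For this scene the true coverage is 1/2. The nine-sample estimate gives 1/3 because only one column of samples falls strictly inside the fragment, while the sixteen-sample estimate happens to give 1/2. This illustrates how the result depends on where the samples fall.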

The advantage of the point-sampling approach is that in many situations, point 
samples may be taken from signals that are too complex to represent analytically. If 
we relied on the analytic technique alone, when the complexity passed a certain limit 
we would have to give up and tell our users that we simply cannot render certain 
types of scenes, or start making approximations that may be unacceptable. Using a 
discrete version of the signal, even if each sample is very expensive to compute, we 
can still evaluate the signal and process it. 
FIGURE 8.2

Taking point samples within the pixel to evaluate its contents. 


8.1.2 Reconstruction: Evaluating Incident Light at a Point

To illustrate the need to synthesize a new, continuous-time signal from a set of
samples, suppose we want to find the light incident upon a point on an opaque
surface. The light may be considered a continuous-time 2D signal $f(\theta, \phi)$ defined
everywhere on the hemisphere centered at the point and covering it and the surface,
as in Figure 8.3.

The description of the incident light can depend on every object in the scene, and
the complex interactions of light between those objects before it reaches this point.
Finding an analytic expression for this signal may be impossible; if possible, it would
probably be horribly complex. So we approximate $f(\theta, \phi)$ by taking a number of
point samples as we did for the scene under a pixel, creating a sampled signal $g[n]$,
defined only for $n$ different points $(\theta_n, \phi_n)$.

We now want to find how this light interacts with the surface. We will assume
for the moment a common model that describes the reflection properties of a surface
with a function $r(\theta, \phi)$ that provides the intensity of the light reflected from the
surface for each direction $(\theta, \phi)$ on the hemisphere, into some particular direction of
interest, as in Figure 8.4. (We’ll ignore color right now for the sake of simplicity.)

The reflection function $r(\theta, \phi)$ is a continuous-time function in two parameters.
If we simply apply $r$ to each of our samples and sum, we will be ignoring all the
light information between the samples. Our samples of the incident light are just
representatives of the complete description; if we are lucky, they approximate the
average light in their general area. To find the total description of the incident light,
we need to reconstruct the continuous-time signal $g(\theta, \phi)$ from its discrete version
$g[n]$. We can then multiply the incident light and the reflection together, finding the
total light $v$ reflected from the surface in some direction:

$$v = \iint g(\theta, \phi)\, r(\theta, \phi)\, d\theta\, d\phi \tag{8.3}$$

FIGURE 8.3

The light $g(\theta, \phi)$ striking a point comes from its enclosing hemisphere.

FIGURE 8.4

The light from each incident direction $(\theta, \phi)$ is reflected with intensity $r(\theta, \phi)$.


8.1.3 Outline of this Chapter

We will have much more to say about practical techniques for anti-aliasing later. 
Our purpose behind this discussion was to demonstrate that in computer graphics 
we need to both sample and reconstruct signals to do our work, and to get into the 
mood for discussing anti-aliasing. 

Aliasing can creep into our system during both the sampling and reconstruction 
steps. Our goal will be to find those conditions under which we can take a signal, 
sample it, and then reconstruct exactly the same signal by combining just the samples. 

We will first look closely at sampling, detailing what happens when we take a 
continuous, aperiodic signal and sample it at equally spaced intervals, creating a 
new, discrete signal. It turns out that the Fourier transform of this discrete signal is 
always periodic; this forces us to interpret the sampled signal as just one period of a 
continuous, periodic discrete signal. 

If we want to reconstruct our original aperiodic signal, then we need to somehow
first make the spectrum of the sampled signal aperiodic again; this is the problem of reconstruction. We will
see that there is a precise theoretical condition called the sampling theorem that tells 
us when this necessary periodic-to-aperiodic transformation may be made without 
error; when this condition is not met during either sampling or reconstruction, the 
reconstructed continuous signal contains unwanted energy that shows up as aliasing 
artifacts. 


8.1.4 Uniform Sampling and Reconstruction of a 1D Continuous Signal

We mentioned above that the Fourier transform of a uniformly sampled signal is 
periodic in frequency space. This is a very important result, and it is straightforward 
to prove. 

To sample or digitize a signal is to evaluate it at a series of values of its parameter. 
To sample a signal uniformly or periodically means that the parameters of the sample 
values are themselves taken from a periodic function. 
In one dimension, we uniformly sample a continuous signal $f(t)$ with a shah
function $s(t) = \mathrm{III}_{T_0}(t)$ of period $T_0$ to create a sampled signal $g(nT_0)$:

$$g(nT_0) = f(t)\,s(t) \tag{8.4}$$

This may be drawn in a system diagram, as in Figure 8.5. The output of the system,
$g(t)$, is defined as 0 at all values of $t$ except $nT_0$, where it is the value of $f(t)$, as in
Figure 8.6.

We would like to find the Fourier transform of $g(t)$. The convolution property
of Fourier transforms tells us that multiplication in the signal domain is equivalent
to convolution in the frequency domain, so

$$
\begin{aligned}
G(\omega) &= F(\omega) * S(\omega) \\
&= F(\omega) * \frac{1}{\kappa}\,\mathrm{III}_{\omega_0}(\omega) \\
&= F(\omega) * \frac{1}{\kappa} \sum_k \delta(\omega - k\omega_0) \\
&= \frac{1}{\kappa} \sum_k F(\omega - k\omega_0)
\end{aligned}
\tag{8.5}
$$

where on the second line we substituted the Fourier transform of the shah function
from Equation 5.83.

This is the result promised at the start of this section. Equation 8.5 tells us that the
Fourier transform of $g(t)$ is a periodic repetition of the transform of $f(t)$, repeated
with a period of $\omega_0$. This is illustrated in Figure 8.7.

In Figure 8.7, the spectrum $F(\omega)$ is bandlimited, which means that the spectrum
has finite support. In other words, $F(\omega) = 0$ for all $|\omega| > \omega_F$. The frequency $\omega_F$ is
called the cutoff frequency for the signal $f(t)$.

Equation 8.5 says that copies of $F(\omega)$ are placed at intervals of $\omega_0$, which is
derived from the period $T_0$ of the sampling shah function $s(t)$ by $\omega_0 = 2\pi/T_0$. When
$|\omega_0| > 2\omega_F$, there is sufficient space between copies of $F(\omega)$ that they do not overlap,
as shown in Figure 8.8(a). When $|\omega_0| < 2\omega_F$, the copies of $F(\omega)$ overlap with
one another and sum together, as shown in Figure 8.8(c).

Recall that we often want to reconstruct our original signal $f(t)$ from its sampled
version $g(t)$. Stated another way, we want to reconstruct $f(t)$ from its samples $f(nT)$.
In practice we will often modify the samples in some way before reconstruction, but
recovery of the input signal from its samples is the simplest form of the problem and
includes all the relevant issues.

If we can somehow find $F(\omega)$ knowing only $g(t)$, then we can recover $f(t)$ using
the inverse Fourier transform $f(t) = \mathcal{F}^{-1}\{F(\omega)\}$. Consider again Figure 8.8. When
$\omega_0 > 2\omega_F$, the spectrum $G(\omega)$ contains multiple, identical copies of $F(\omega)$ at periodic
intervals. But this is not the same as $F(\omega)$. The inverse transform for $G(\omega)$ gives
the signal $g(t)$, which is zero everywhere but $t = nT_0$; the extra copies of $F(\omega)$ serve
to suppress the information between samples of $g(t)$. To get back $f(t)$, we need to
isolate just the center copy of $F(\omega)$. One way to do this is to multiply $G(\omega)$ with a
box spectrum $B_{\omega_F}(\omega)$, as shown in Figure 8.9:

$$F(\omega) = G(\omega)\,B_{\omega_F}(\omega) \tag{8.6}$$

and then we can recover $f(t)$ from the inverse transform. The critical point here is
that the central copy of $F(\omega)$ needs to be isolated; that means no other copies can
overlap with it.

FIGURE 8.5

A sampling system.

FIGURE 8.7

(a) $F(\omega)$. (b) $G(\omega) = \sum_k F(\omega - k\omega_0)$.

FIGURE 8.8

(a) Because $\omega_0 > 2\omega_F$, the copies of $F(\omega)$ do not overlap. (b) At exactly $\omega_0 = 2\omega_F$, the copies of
$F(\omega)$ just touch each other. (c) When $\omega_0 < 2\omega_F$, the copies of $F(\omega)$ overlap and sum together.

FIGURE 8.9

$F(\omega) = G(\omega)\,B_{\omega_F}(\omega)$.

The copies of $F(\omega)$ are distinct only when $\omega_0 > 2\omega_F$. When this condition is not
fulfilled, copies of $F(\omega)$ overlap and sum together. For example, when the sampling
frequency is too low, and we then filter with the box, the value of $G(\omega)$ for some
$\omega < \omega_F$ is not $F(\omega)$ but rather $F(\omega) + F(\omega')$ for some $\omega' \neq \omega_F$, as illustrated in
Figure 8.10.

FIGURE 8.10

When $\omega_0 < 2\omega_F$, some energy at $\omega' \neq \omega_F$ adds itself to the energy at $\omega$.

Because the energy at $\omega'$ adds to the energy at $\omega$, we say that $\omega'$ is an alias for $\omega$.
When $\omega_0 < 2\omega_F$, the sampled spectrum $G(\omega)$ is said to contain aliases, or is aliased,
and we will be unable to recover our original signal $f(t)$ from its samples. When
$\omega_0 > 2\omega_F$, then an isolated copy of $F(\omega)$ exists and we can extract it.
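The idea that one frequency can stand in for another after sampling can be checked numerically. The sketch below samples two cosines whose frequencies differ by exactly one sampling frequency $\omega_0$ and compares the samples. The particular period and frequencies are hypothetical values chosen for illustration.

```python
import math

def sample_cosine(omega, period, count):
    """Return cos(omega * t) evaluated at t = n * period for n = 0..count-1."""
    return [math.cos(omega * n * period) for n in range(count)]

T0 = 0.1                       # sampling period (hypothetical value)
omega_0 = 2.0 * math.pi / T0   # sampling frequency
omega = 5.0                    # a frequency below omega_0 / 2
omega_alias = omega + omega_0  # differs from omega by one spectral period

low = sample_cosine(omega, T0, 50)
high = sample_cosine(omega_alias, T0, 50)
max_difference = max(abs(a - b) for a, b in zip(low, high))
```

The two sample sequences agree to within rounding error, so once the signal has been sampled there is no way to tell the two cosines apart.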

This observation is the celebrated sampling theorem for 1D signals and uniform
sampling. We state it here as

The 1D Uniform Sampling Theorem (first half):

A bandlimited signal $f(t)$ with cutoff frequency $\omega_F$ may be perfectly
reconstructed from its samples $f(nT_0)$ if $2\pi/T_0 > 2\omega_F$.

The frequency $\omega_F$ is called the Nyquist frequency for the signal; the corresponding
sampling rate $2\omega_F$ is called the Nyquist rate. If a signal is sampled less often than required by
the sampling theorem, we say that the signal is undersampled. Similarly, if a signal
is sampled more often than necessary, it is oversampled.

The only requirements imposed by the sampling theorem are that $f(t)$ be bandlimited,
and that we sample with a frequency at least twice as fast as the highest
frequency in $f(t)$.
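The condition in the theorem is easy to test in code. The following sketch uses hypothetical helper names and angular frequencies throughout. It checks whether a sampling period is short enough for a given cutoff frequency, and reports the bound that the period must stay below.

```python
import math

def satisfies_sampling_theorem(period, cutoff):
    """True when the sampling frequency 2*pi/period exceeds twice the cutoff."""
    return 2.0 * math.pi / period > 2.0 * cutoff

def largest_allowed_period(cutoff):
    """Upper bound on the period: sampling must use a period strictly below this."""
    return math.pi / cutoff

ok = satisfies_sampling_theorem(0.1, 10.0)        # 2*pi/0.1 is about 62.8, above 20
too_slow = satisfies_sampling_theorem(1.0, 10.0)  # 2*pi/1.0 is about 6.28, below 20
```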


8.1.5 What Signals Are Bandlimited?

As we mentioned earlier, the types of signals typically encountered in computer
graphics have compact support. For example, the image of a polygon has definite,
sharp borders. What does this mean in terms of the sampling theorem? A strict
interpretation says that any signal with compact support cannot be correctly sampled
and then reconstructed because a signal cannot simultaneously have finite width (i.e.,
have compact support in signal space) and be bandlimited (i.e., have compact support
in frequency space). The rule of thumb is that increasingly sharp edges in a signal
require increasingly high frequencies.

To see this, consider any signal $x(t)$ such that $x(t) = 0$ for all $|t| > W/2$. We can
write $x(t)$ as the product of itself and a box that just encloses it, since the box is
unity where $x(t)$ is defined, and zero elsewhere:

$$x(t) = b_W(t)\,x(t) \tag{8.7}$$

The Fourier transform of $b_W(t)$ is given by Equation 5.15. Thus the Fourier transform
of Equation 8.7 is

$$X(\omega) = \kappa W \operatorname{sinc}\!\left(\frac{W\omega}{2\pi}\right) * X(\omega) \tag{8.8}$$

Since the sinc function has infinite width, when we convolve $X(\omega)$ with the sinc in
frequency space, as long as $X(\omega)$ has at least one nonzero value, the left-hand side
of Equation 8.8 has infinite extent. Our only assumption was that $x(t)$ had finite
width. Thus if $x(t)$ has finite width, its Fourier transform has infinite width.

This is another reason why aliasing problems are so prevalent in computer graphics.
Our signals typically have finite width: for example, a sphere of some finite
radius, a pixel of some size, or a texture of some given width and height. Even if 
we deal with continuous representations of these objects, when we sample them we 
are giving up any hope of recreating them without error; the very fact that they have 
finite extent means that no finite number of samples will ever perfectly capture their 
edges, which require arbitrarily high frequencies. 

We can ameliorate this problem somewhat by treating each finite signal as one 
period of an infinite, periodic signal. Since the signal is now considered infinite, we 
can hope to capture the signal with a finite number of samples. This will be our 
approach in later chapters. 


8.2 Reconstruction 

We now turn to the problem of recovering f(t) from its samples f(nT). 

The sampling theorem for uniformly spaced samples says that as long as the 
sampling rate is at least twice the highest frequency in the signal, the signal can be 
recovered from its samples. This recovery process is called reconstruction . 

If the sampling theorem is met for some signal $f(t)$, then we can reconstruct it
by applying a perfect low-pass filter with width $\omega_F$ to $G(\omega)$, as in Equation 8.6.
An important practical observation is that multiplication in the frequency domain
is equivalent to convolution in the signal domain. So the recovered signal with
spectrum

$$F(\omega) = G(\omega)\,R(\omega) \tag{8.9}$$

may be computed in the time domain by Equation 8.6 as

$$
\begin{aligned}
f(t) &= g(t) * r(t) \\
&= \sum_n g(nT)\, r(t - nT)
\end{aligned}
\tag{8.10}
$$

for some reconstruction filter $r(t)$.

Equation 8.10 is an interpolation formula that tells us how to derive new values
for $f(t)$ between sample points $f(nT)$, using the filter $r(t)$. In Equation 8.6 we
multiplied in frequency space with a box with cutoff frequency $\omega_F$. We repeat here
the inverse Fourier transform for a box spectrum from Equation 5.73:

$$b(t) = \mathcal{F}^{-1}\{B_{\omega_F}(\omega)\} = \kappa\,\omega_F \operatorname{sinc}\!\left(\frac{\omega_F t}{2\pi}\right) \tag{8.11}$$

If we use the box for filtering in frequency space, then our reconstruction filter
$r(t) = b(t)$, so

$$
\begin{aligned}
f(t) &= \sum_n f(nT)\,\kappa\,\omega_F \operatorname{sinc}\!\left(\frac{\omega_F}{2\pi}(t - nT)\right) \\
&= (\kappa\,\omega_F) \sum_n f(nT) \operatorname{sinc}\!\left(\frac{\omega_F}{2\pi}(t - nT)\right)
\end{aligned}
\tag{8.12}
$$

Equation 8.12 is called the bandlimited reconstruction formula, because it tells
us how to reconstruct any correctly sampled bandlimited signal from its
samples, using the canonical reconstruction filter $\operatorname{sinc}((\omega_F/2\pi)(t - nT))$.

Recall that when a signal is bandlimited, the signal itself has infinite extent. This
condition is satisfied by periodic signals.

We can now state the full Uniform Sampling Theorem: 


The 1D Uniform Sampling Theorem: A bandlimited signal
$f(t)$ with cutoff frequency $\omega_F$, sampled with period $T_0$
such that $2\pi/T_0 > 2\omega_F$, may be perfectly reconstructed from
its samples $f(nT_0)$ by convolution with the reconstruction
filter

$$r(t) = \operatorname{sinc}\!\left(\frac{\omega_F}{2\pi}(t - nT_0)\right) \tag{8.13}$$


Equation 8.12 tells us that we can reconstruct the signal $f(t)$ by working entirely
in the spatial domain. We place a copy of the sinc function at each sample location
$nT$, and scale it by the sample height $f(nT)$ at that point. The sum of all these scaled
sinc functions is the original signal $f(t)$. An illustration of this technique is shown in
Figure 8.11. Note that the sincs go through zero at every sample point but the one
they are centered over.

FIGURE 8.11

Reconstructing a signal from a sum of sinc functions.
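The sum-of-sincs procedure can be sketched directly. The code below is an illustration under stated assumptions, not the book's implementation: it uses the normalized sinc, $\operatorname{sinc}(x) = \sin(\pi x)/(\pi x)$, scaled so that its zeros fall on the other sample points (the argument is $(t - nT)/T$), takes the constant in front of the sum to be 1, and truncates the infinite sum to the samples $n = -N, \ldots, N$. The names `reconstruct`, `N`, and the test signal are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def reconstruct(samples, first_index, period, t):
    """Sum one scaled sinc per sample; samples[i] was taken at (first_index + i) * period."""
    total = 0.0
    for i, value in enumerate(samples):
        n = first_index + i
        total += value * sinc((t - n * period) / period)
    return total

T = 1.0
alpha = 2.0 * math.pi / (3.0 * T)   # below pi / T, so the cosine is sampled adequately
N = 2000                            # truncation: samples n = -N..N (an assumption)
samples = [math.cos(alpha * n * T) for n in range(-N, N + 1)]

at_sample = reconstruct(samples, -N, T, 3.0 * T)   # lands exactly on a sample
between = reconstruct(samples, -N, T, 0.5 * T)     # halfway between two samples
```

At a sample location only the sinc centered there contributes, so the sample value is returned essentially exactly. Between samples the truncated sum is only approximate, because the sinc tails decay slowly and every discarded sample would have contributed a little.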

So to reconstruct a signal $f$ from its sampled version $g$, we have two options,
one in each domain. In the frequency domain, we may find $F = GB$; the Fourier
transform of the signal is equal to the periodic spectrum of $G$ times a box filter $B$,
which isolates just the center copy. In the spatial domain, we may compute $f = g * b$,
where we convolve the sampled signal $g$ with the inverse transform of the box
function $b$ (which is a sinc). Both of these approaches are useful conceptually. One
will usually be computationally cheaper than the other in most practical situations;
the choice usually depends on which representation of the signals is most easily
computed (or already available), and whether the output needs to be in signal or
frequency space.

We said earlier that when the sampling rate is too low, copies of $F(\omega)$ will overlap
each other, so some energy from above the cutoff frequency will leak into the central
copy (or alias), disrupting our attempts at reconstruction. Aliasing can also occur
if we reconstruct improperly. For example, the spectrum $F(\omega)$ of some signal and
the spectrum of its sampled version $G(\omega)$ are shown in Figure 8.12(a) and (b). If
we make a poor choice for the reconstruction filter $B(\omega)$, say a box with width $2\omega_F$
as in Figure 8.12(c), then the reconstructed signal will not match the input, since
$G(\omega)B(\omega) \neq F(\omega)$.

In this case the problem is not aliasing in the strict sense, since we do not have energy
from above the Nyquist limit leaking into the central copy. In fact, the sampling
theorem is met by Figure 8.12(b); it’s the reconstruction step that is introducing error.
Unfortunately, sometimes in the graphics literature this effect is also referred to as
aliasing. It is better to reserve the term aliasing for the effects due to undersampling,
and refer to the effects of poor reconstruction as reconstruction errors.

FIGURE 8.12

(a) The spectrum $F(\omega)$ of an input signal. (b) The spectrum of $G(\omega)$ when the sampling theorem
is met. (c) A poor choice of filter $B(\omega)$. (d) The resulting spectrum $G(\omega)B(\omega) \neq F(\omega)$.

To show the effect of sampling rate on a signal, Figure 8.13(a) shows a signal $f(t)$
made up of a fixed number of cycles of a sine wave (here we use $f(t) = (\sin(x) + 1)/2$
so that $0 \le f(t) \le 1$). Figure 8.13(b) shows the result of sampling that signal with
shah functions of gradually decreasing period, and thus increasing frequency. As
the period goes down, we have more samples within the finite interval within which
the signal is defined. When the sampling frequency reaches the Nyquist limit, our
samples are sufficiently close to capture $f(t)$, and further numbers of samples don’t
improve our estimate; above that rate we are oversampling. We used a sinc function
to reconstruct each row of Figure 8.13(b).


8.2.1 Zero-Order Hold Reconstruction

A common hardware setup for displaying computer-generated images is the combination
of a frame buffer and a rectangular-grid-based display device. For the current
discussion, we will assume that such devices display a constant-intensity signal between
samples; an LED display with a diffuser or a high-resolution print image may
match this assumption well. A CRT probably would not, because the color at each
point is ideally represented by a Gaussian bump and not a flat field of color, as
discussed in Chapter 3. However, the assumption that the signal is constant between
display centers is much easier to work with, which is why we use it here. We will
focus our discussion on a single scan line of pixels, though in two dimensions there
can be interactions between adjacent scan lines.

FIGURE 8.13

(a) $f(t) = (\sin(x) + 1)/2$. (b) Reconstruction of $f(t)$ after sampling with different shah functions.
The frequency of the sampling function increases along the vertical axis.

If we only store color values at the centers of pixels, then we are essentially letting
the display device fill in the signal between pixels with whatever intensity is generated
between one pixel center and the next. Under our assumption, the reconstructed
signal $r(t)$ between two samples $r(nT)$ and $r((n+1)T)$ is just $r(nT)$. This is called a
zero-order reconstruction, or zero-order hold, and is illustrated in Figure 8.14 [282].

We can describe the system with a system diagram like Figure 8.15. When our
rendering is complete, we have built an estimate of the image $f(t)$ that we would
like to display. We know that we will show this on a device with interpixel spacing
$p$, so we make sure that $f(t)$ is bandlimited to $\omega_F < \pi/p$. We then sample the signal
with a shah function $s(t)$, which has an impulse at the center of each pixel, resulting
in a sampled signal $g(t) = f(t)s(t)$.

FIGURE 8.15

A model of zero-order hold.

The zero-order hold may be modeled by a filter with an impulse response $h_p(t)$
such that

$$h_p(t) = \begin{cases} 0 & t < 0 \\ 1 & 0 \le t < p \\ 0 & p \le t \end{cases} \tag{8.14}$$

When we apply this to $g(t)$, we get the display signal $d(t)$:

$$d(t) = g(t) * h_p(t) \tag{8.15}$$
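Convolving the impulses of $g(t)$ with the box $h_p(t)$ simply holds each sample value until the next sample arrives. A minimal sketch of that behavior follows. It assumes the samples sit at $t = 0, p, 2p, \ldots$ and that nothing is displayed outside the sampled range. The function name and scan line values are hypothetical.

```python
import math

def zero_order_hold(samples, spacing, t):
    """Return the held value at position t for samples taken at n * spacing.

    Mirrors d(t) = g(t) * h_p(t): each sample value persists over
    [n * spacing, (n + 1) * spacing).
    """
    n = math.floor(t / spacing)
    if n < 0 or n >= len(samples):
        return 0.0  # outside the sampled range nothing is displayed
    return samples[n]

scanline = [0.2, 0.8, 0.5]   # hypothetical pixel intensities
p = 1.0                      # interpixel spacing
```

Evaluating `zero_order_hold(scanline, p, t)` for increasing `t` traces out the staircase signal that such a display would show.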
FIGURE 8.16

A 2D shah function with periods $T_1$ and $T_2$.


8.3 Sampling in Two Dimensions

We now turn our attention to sampling in two dimensions. The 2D sampling theory 
is very similar to the 1D case. The difference is simply in the complexity of the
notation. Each equation gets a bit more than twice as complicated, because many 
symbols now appear with two different names (one for each axis) and they must be 
kept distinct. The ideas in this section parallel those in previous sections, though the 
equations are much busier. 

Let’s suppose that we are given a continuous-time 2D function $f(x, y)$, sampled
on a rectangular grid with periods $(T_1, T_2)$, as in Figure 8.16. This forms the sampled
signal $f(mT_1, nT_2) = g[m, n]$. Under what conditions can we recover $f$ from $g$?

To express $g$, we start with a sampling signal $s[m, n]$ with periods $T_1$ and $T_2$,

$$s[m, n] = \sum_m \sum_n \delta(x - mT_1,\, y - nT_2) \tag{8.16}$$

and then multiply $s$ with $f$ to form $g$.

$$
\begin{aligned}
g[m, n] &= f(x, y)\,s[m, n] \\
&= \sum_m \sum_n f(x, y)\,\delta(x - mT_1,\, y - nT_2) \\
&= f(mT_1, nT_2)
\end{aligned}
\tag{8.17}
$$
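In code, the outcome of Equation 8.17 is simply the continuous function evaluated on a rectangular grid. The following sketch stores the samples in a dictionary keyed by $(m, n)$. The function names, the test signal, and the grid periods are hypothetical.

```python
import math

def sample_grid(f, T1, T2, m_range, n_range):
    """Return {(m, n): f(m*T1, n*T2)} for the requested integer indices."""
    return {(m, n): f(m * T1, n * T2) for m in m_range for n in n_range}

def f(x, y):
    # Hypothetical continuous signal used only for illustration.
    return math.cos(x) * math.cos(2.0 * y)

g = sample_grid(f, T1=0.5, T2=0.25, m_range=range(-2, 3), n_range=range(-2, 3))
```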

We want to find $\mathcal{F}\{g\}$. Since $g[m,n] = \mathcal{F}^{-1}\{G(\mu,\nu)\} = f(mT_1, nT_2)$, we can
simply find the 2D CTFT of $f$ and then sample it appropriately. We begin
with the 2D CTFT of $f$:

$$
\begin{aligned}
g[m,n] &= f(mT_1, nT_2) \\
&= \mathcal{F}^{-1}\{\mathcal{F}\{f(x,y)\}\} \\
&= \frac{1}{2\pi} \iint F(\mu,\nu)\, e^{j(mT_1\mu + nT_2\nu)}\, d\mu\, d\nu \\
&= \frac{1}{2\pi} \iint \frac{1}{T_1 T_2}\, F(\lambda_1/T_1, \lambda_2/T_2)\, e^{j(m\lambda_1 + n\lambda_2)}\, d\lambda_1\, d\lambda_2
\end{aligned}
\tag{8.18}
$$

using the substitutions

$$\lambda_1 = \mu T_1 \quad\text{and}\quad \lambda_2 = \nu T_2 \tag{8.19}$$

FIGURE 8.17

Tiling the infinite plane with squares $2\pi$ on a side.

The double integration covers the entire plane. We will break up the plane into an
infinite collection of squares, each $2\pi$ on a side, as shown in Figure 8.17. These
squares cover the plane with no gaps or overlap, so finding the integral over each
square and summing the result will give us the same value as integrating over the
plane. We will place one square centered at the origin, and simply abut the rest in
rows and columns.

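We write $SQ(k_1, k_2)$ to indicate the square with origin at $(k_1, k_2)$. Breaking down
the integral above into these squares gives

$$\frac{1}{2\pi} \sum_{k_1} \sum_{k_2} \iint_{SQ(k_1,k_2)} \frac{1}{T_1 T_2}\, F(\lambda_1/T_1, \lambda_2/T_2)\, e^{j(m\lambda_1 + n\lambda_2)}\, d\lambda_1\, d\lambda_2 \tag{8.20}$$

We now make another set of substitutions to remove the awkward integral. We
can set the integrals to cover just one square and choose the limits of the square with
the arguments in the function by substituting

$$\eta = \lambda_1 + 2\pi k_1 \quad\text{and}\quad \xi = \lambda_2 + 2\pi k_2 \tag{8.21}$$

Plugging these in for $\lambda_1$ and $\lambda_2$, expanding the exponentials and then collecting them
again, yields

$$\frac{1}{2\pi} \sum_{k_1} \sum_{k_2} \int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi} \frac{1}{T_1 T_2}\, F\!\left(\frac{\eta - 2\pi k_1}{T_1}, \frac{\xi - 2\pi k_2}{T_2}\right) e^{j(m\eta + n\xi)}\, e^{-j(2\pi k_1 m + 2\pi k_2 n)}\, d\eta\, d\xi \tag{8.22}$$

This equation is certainly a monster, but it can be tamed. The last exponential is
identically 1 for all integer values of $k_1$, $k_2$, $m$, and $n$, giving us

$$
\begin{aligned}
g[m,n] &= \int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi} \left[ \frac{1}{2\pi T_1 T_2} \sum_{k_1} \sum_{k_2} F\!\left(\frac{\eta - 2\pi k_1}{T_1}, \frac{\xi - 2\pi k_2}{T_2}\right) \right] e^{j(m\eta + n\xi)}\, d\eta\, d\xi \\
&= \int_{-\pi}^{\pi}\!\int_{-\pi}^{\pi} G[\eta, \xi]\, e^{j(m\eta + n\xi)}\, d\eta\, d\xi
\end{aligned}
\tag{8.23}
$$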

We have now found the Fourier transform of the discrete signal $g[m, n]$, derived
from sampling the continuous-time signal $f(x, y)$. Remember that our goal is to
recover $f$ from $g$. We will find this is only possible under certain conditions. To
find those conditions, it will be very useful to consider just what $G[\eta, \xi]$ represents
in terms of $F(\mu, \nu)$.

From the derivation above, we write $G[\eta, \xi]$ as

$$G[\eta, \xi] = \frac{1}{2\pi T_1 T_2} \sum_{k_1} \sum_{k_2} F\!\left(\frac{\eta - 2\pi k_1}{T_1}, \frac{\xi - 2\pi k_2}{T_2}\right) \tag{8.24}$$

This shows us that $G$ contains an infinite number of periodic replications of $F$
at intervals of $(2\pi/T_1, 2\pi/T_2)$, as illustrated in Figure 8.18. The centers of the
replicants lie on a square grid with one point at the origin and the others at vectors
$(k_1 2\pi/T_1, k_2 2\pi/T_2)$.

So $G$ contains $F$ at the center, plus many copies at regular intervals in both
directions. If we can isolate the one copy of $F$ lying at the origin, then we can take
its inverse transform and achieve our goal of recovering $f$ from $g$ (here, via $G$).

Under what conditions can we isolate the center copy of $F$ from within $G$?
Figure 8.19 shows that we can draw a square grid around the replicant centers.
Each square has width $2\pi/T_1$ and height $2\pi/T_2$, with one square at the origin and
others surrounding it.
FIGURE 8.18

The spectrum of $G$ contains an infinite grid of replications of $F$.

FIGURE 8.19

The squares of isolation in $G$.

FIGURE 8.20

(a) The spectrum $F$ fits within the limiting square in $G$. (b) The spectrum $F$ does not fit within the
limiting square, so copies of $F$ in $G$ overlap and sum.


If the spectrum of $f$ fits within one of these squares, then the various replicants
will not overlap, as in Figure 8.20(a). But if the spectrum of $f$ exceeds the limits of
the square, then the replicants will overlap and sum together. This will happen at
every location, including the center one, which will ruin our chances of recovering a
pure copy of $F$. This overlap is shown in Figure 8.20(b).

This overlap is 2D aliasing. If the spectrum of $f$ is outside of this box, then we will
be unable to sample $f$ on this grid and then get $f$ back later. Of course, we can change
the grid, making the boxes larger by placing the samples more closely together. Since
the box sides are $(2\pi/T_1, 2\pi/T_2)$, we can enlarge the boxes by shrinking the sampling
interval of the sampling array of impulses $s$, and thus increasing the density of $g$. But for any
regular grid there will always be a box and $F$ will have to fit within it.

We say that $f$ is bandlimited if its spectrum is zero beyond some finite range. We
can formalize this in the 2D sampling theorem for uniform samples, which specifies
the Nyquist limits in the $x$ and $y$ directions for the 2D spectrum $F(\mu, \nu)$:

The 2D Uniform Sampling Theorem (first half): A bandlimited
signal $f(x, y)$ with cutoff frequencies $\mu_F$ and $\nu_F$ may be perfectly
reconstructed from its samples $f(mT_1, nT_2)$ if $2\pi/T_1 > 2\mu_F$ and
$2\pi/T_2 > 2\nu_F$.
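Because the two conditions are independent, a grid can be fine enough along one axis and still alias along the other. A small sketch of the check, with a hypothetical function name and angular frequencies throughout:

```python
import math

def satisfies_2d_sampling_theorem(T1, T2, mu_F, nu_F):
    """Check 2*pi/T1 > 2*mu_F and 2*pi/T2 > 2*nu_F independently per axis."""
    return (2.0 * math.pi / T1 > 2.0 * mu_F) and (2.0 * math.pi / T2 > 2.0 * nu_F)

both_ok = satisfies_2d_sampling_theorem(0.1, 0.1, 10.0, 20.0)
y_too_coarse = satisfies_2d_sampling_theorem(0.1, 0.2, 10.0, 20.0)
```

In the second call the spacing along $y$ has been doubled, which halves the height of the isolating box and causes the test to fail even though the $x$ axis is unchanged.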

If $F$ satisfies the conditions of the 2D sampling theorem, then it is sufficiently
bandlimited that we may write $G$ within the center square as

$$G[\mu T_1, \nu T_2] = \frac{1}{2\pi T_1 T_2}\, F(\mu, \nu) \quad \text{for } |\mu| < \pi/T_1 \text{ and } |\nu| < \pi/T_2 \tag{8.25}$$

We may simply invert Equation 8.25 to recover $F$ from $G$:

$$F(\mu, \nu) = \begin{cases} 2\pi T_1 T_2\, G[\mu T_1, \nu T_2] & |\mu| < \pi/T_1 \text{ and } |\nu| < \pi/T_2 \\ 0 & \text{otherwise} \end{cases} \tag{8.26}$$


8.4 Two-Dimensional Reconstruction

The 2D sampling theorem tells us that when $F$ is appropriately bandlimited, we can
recover $f(x, y)$ from a sampled version $f(mT_1, nT_2)$. In this section we present the
mechanics of this reconstruction. We will see that it again closely parallels the 1D
case.

We apply the inverse Fourier transform to the central square of $G[\eta, \xi]$. The
expressions will quickly get very busy again because almost everything appears
twice, but as in the last section the ideas are almost the same as in the 1D case. To
make the notation a trifle simpler, we will use the substitutions

$$W_1 = \pi/T_1 \quad\text{and}\quad W_2 = \pi/T_2 \tag{8.27}$$

We start with the definition of the inverse transform, narrow the range of integration
to the center square, and substitute the value for $G$ from Equation 8.25:

$$
\begin{aligned}
f(x, y) &= \frac{1}{2\pi} \iint F(\mu, \nu)\, e^{j(\mu x + \nu y)}\, d\mu\, d\nu \\
&= \frac{1}{2\pi} \int_{-W_1}^{W_1}\!\int_{-W_2}^{W_2} 2\pi T_1 T_2\, G[\mu T_1, \nu T_2]\, e^{j(\mu x + \nu y)}\, d\mu\, d\nu \\
&= \frac{1}{2\pi} \int_{-W_1}^{W_1}\!\int_{-W_2}^{W_2} 2\pi T_1 T_2\, \mathcal{F}\{g[m, n]\}\, e^{j(\mu x + \nu y)}\, d\mu\, d\nu
\end{aligned}
\tag{8.28}
$$

Now we can expand the Fourier transform of $g$ explicitly to find:

$$
\begin{aligned}
f(x, y) &= \frac{1}{2\pi} \int_{-W_1}^{W_1}\!\int_{-W_2}^{W_2} 2\pi T_1 T_2 \left[ \frac{1}{2\pi} \sum_m \sum_n g[m, n]\, e^{-j(\mu T_1 m + \nu T_2 n)} \right] e^{j(\mu x + \nu y)}\, d\mu\, d\nu \\
&= \frac{T_1 T_2}{2\pi} \sum_m \sum_n g[m, n] \int_{-W_1}^{W_1}\!\int_{-W_2}^{W_2} e^{-j[\mu(T_1 m - x) + \nu(T_2 n - y)]}\, d\mu\, d\nu \\
&= \frac{T_1 T_2}{2\pi} \sum_m \sum_n g[m, n] \int_{-W_1}^{W_1} e^{-j\mu(T_1 m - x)}\, d\mu \int_{-W_2}^{W_2} e^{-j\nu(T_2 n - y)}\, d\nu
\end{aligned}
\tag{8.29}
$$
FIGURE 8.21

The 2D reconstruction filter $r(x, y; m, n) = \operatorname{sinc}\!\left[\frac{W_1}{\pi}(x - T_1 m)\right] \operatorname{sinc}\!\left[\frac{W_2}{\pi}(y - T_2 n)\right]$.


There is no question that Equation 8.29 is another monster. But notice the
integrals on the right are in the form of property E19, so we may rewrite them:

$$
\begin{aligned}
f(x, y) &= \frac{4 T_1 T_2}{2\pi} \sum_m \sum_n g[m, n]\, \frac{\sin[(x - T_1 m) W_1]}{x - T_1 m}\, \frac{\sin[(y - T_2 n) W_2]}{y - T_2 n} \\
&= 2\pi \sum_m \sum_n g[m, n]\, \frac{\sin[(x - T_1 m) W_1]}{W_1 (x - T_1 m)}\, \frac{\sin[(y - T_2 n) W_2]}{W_2 (y - T_2 n)} \\
&= 2\pi \sum_m \sum_n g[m, n]\, \operatorname{sinc}\!\left[\frac{W_1}{\pi}(x - T_1 m)\right] \operatorname{sinc}\!\left[\frac{W_2}{\pi}(y - T_2 n)\right]
\end{aligned}
\tag{8.30}
$$

This completes our reconstruction of $f(x, y)$. The reconstruction filter $r(x, y; m, n)$
is given by

$$r(x, y; m, n) = \operatorname{sinc}\!\left[\frac{W_1}{\pi}(x - T_1 m)\right] \operatorname{sinc}\!\left[\frac{W_2}{\pi}(y - T_2 n)\right] \tag{8.31}$$

and is plotted in Figure 8.21. Notice that $r$ is not radially symmetric; it is the product
of two orthogonal sinc functions.

We can now state the full 2D Uniform Sampling Theorem: 
The 2D Uniform Sampling Theorem: A bandlimited signal
$f(x, y)$ with cutoff frequencies $\mu_F$ and $\nu_F$, sampled with
periods $T_1$ and $T_2$ such that $2\pi/T_1 > 2\mu_F$ and $2\pi/T_2 > 2\nu_F$,
may be perfectly reconstructed from its samples $f(mT_1, nT_2)$
by convolution with the reconstruction filter

$$r(x, y; m, n) = \operatorname{sinc}\!\left[\frac{W_1}{\pi}(x - T_1 m)\right] \operatorname{sinc}\!\left[\frac{W_2}{\pi}(y - T_2 n)\right]$$


So to find the value of $f(x_0, y_0)$, conceptually move the reconstruction filter so its
center is at $(x_0, y_0)$, and multiply it with the sampled signal $g$. The sum of the
point-by-point product of these two functions times $2\pi$ yields the value of $f(x_0, y_0)$. This
procedure isn’t practical, however, because the sinc function has infinite support,
and $g$ is infinitely periodic in both dimensions. Thus, we would need to carry out
the products and summations to infinity in two directions to find the correct result.

Notice that Equation 8.30 has the form of a double convolution sum. This
confirms the general approach. As we discussed for 1D reconstruction, filtering a
spectrum with a box is equivalent to convolving the signal with a sinc.


8.5 Reconstruction in Image Space

As an example of the importance of proper reconstruction, we will return to our 
discussion of pixel colors from the start of the chapter. We will show that simply 
averaging the sample values is a poor idea, and suggest a better route. 


8.5.1 The Box Reconstruction Filter

Recall Equation 8.2. To derive that equation, we reasoned that there were nine 
uniformly distributed samples in the pixel, each representing an equal amount of 
area. So we simply weighted each one by 1/9 and added them together. 
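
The averaging just described can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the pixel is taken to be a unit square with its
lower-left corner at integer coordinates, and `shade(x, y)` is a hypothetical function
that returns the image value at a point.

```python
def box_filtered_pixel(shade, px, py, n=3):
    """Average n*n uniformly spaced point samples over the unit pixel
    whose lower-left corner is (px, py)."""
    weight = 1.0 / (n * n)          # 1/9 for the 3 x 3 case
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += weight * shade(px + (i + 0.5) / n, py + (j + 0.5) / n)
    return total

# Illustrative image: a vertical edge at x = 0.5 (dark left, bright right).
def edge(x, y):
    return 1.0 if x >= 0.5 else 0.0

pixel_value = box_filtered_pixel(edge, 0, 0)   # 6 of the 9 samples are bright
```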

Consider this now in terms of reconstruction; this simple averaging is not equivalent
to convolving each of the nine samples with the necessary sinc function. Since
we’re not satisfying the requirements of the sampling theorem, this process cannot
recover the signal correctly. Let’s look at what signal this process of approximate
reconstruction actually does synthesize.

To make the presentation simpler, we will rephrase the discussion in one dimension.
We assume some underlying signal $f(t)$, which we sample with a shah function
$s_{T_0}(t)$, as in Figure 8.22(a) and (b); their product $g(t) = f(t) s_{T_0}(t)$ is equivalent to
our point samples in a pixel. We will say that our 1D “pixel” spans the interval
$T = [-2T_0, 2T_0]$. To make sure we have no aliasing, we’ll pick $f(t) = \cos(at)$, with
$a = (2\pi)/(3T_0)$ (any value of $a < \pi/T_0$ will guarantee accurate sampling).

Our simple reconstruction scheme adds up all the values in the pixel and averages
them; this is equivalent to convolving $g(t)$ with a box whose width covers one pixel,
$h(t) = b_{4T_0}(t)$, as shown in Figure 8.22(d). Note that each box will contain either
one or two samples, depending on where the pixel boundaries are, because we’re
sampling at 2/3 of the Nyquist rate.

To find the reconstructed signal, we shift the box to each sample and scale it,
resulting in the reconstructed signal $h * (fs)$ in Figure 8.22(e). Resampling by the pixel
samples in Figure 8.22(f) gives

$$r(t) = p(t) \, [h(t) * g(t)] \tag{8.33}$$

as shown in Figure 8.22(g). Finally, we convolve again with the device’s characteristic
display function $m(t)$ in Figure 8.22(h), giving us $d = m * \{p \, [h * (fs)]\}$, as shown in
Figure 8.22(i).

The new signal $d(t)$ certainly has a value at each pixel center, but it seems unlikely
that we have reconstructed correctly, since we have not followed the sampling theorem
and convolved with a sinc. In fact, we convolved with the inverse transform of
a sinc instead. To see what $d(t)$ represents, let’s find its Fourier transform $D(\omega)$.

Figure 8.23 shows the Fourier space representation of the signals in Figure 8.22. 
We know that the transform of a shah is another shah: 

SM = ^IIL>) (8.34) 

The Fourier transform of a cosine is easily found. Since $\cos(at) = (e^{jat} + e^{-jat})/2$,
then by linearity its transform is


$$\mathcal{F}\{\cos(at)\} = \mathcal{F}\left\{\frac{e^{jat} + e^{-jat}}{2}\right\} = \frac{1}{2}\left(\delta(\omega - a) + \delta(\omega + a)\right) \tag{8.35}$$


This seems reasonable; to make a signal $\cos(at)$, we need only add the two complex
exponentials $e^{jat} + e^{-jat}$; the imaginary parts cancel each other out and we’re left
with the real cosine term.

Since multiplication in one domain matches convolution in the other, we know
that $g(t) = f(t)s(t)$ is equivalent to $G(\omega) = F(\omega) * S(\omega)$.




FIGURE 8.22

One-dimensional box filtering in signal space. (a) The original signal $f(t)$. (b) A sampling impulse
train $s_{T_0}(t)$. (c) The sampled signal $f(t) s_{T_0}(t)$. (d) A box reconstruction filter. (e) The reconstructed
signal. (f) A resampling impulse train. (g) The resampled signal. (h) The device display function.
(i) The displayed signal.

FIGURE 8.23

One-dimensional box filtering in frequency space. (a)-(i) The Fourier transforms of the signals in
Figure 8.22(a)-(i), respectively.

Similarly, we know from before that the transform of a box is a sinc, so we can
find $H(\omega)$ from $h(t)$ as


$$H(\omega) = 4T_0 \, \mathrm{sinc}\left[\frac{2T_0}{\pi}\omega\right] \tag{8.36}$$


as shown in Figure 8.23(d). The convolution in space is then multiplication in
frequency; our reconstructed signal $HG$ is in Figure 8.23(e). Then resampling with
the pixels $P$ shown in Figure 8.23(f) leads to the resulting

$$B(\omega) = P(\omega) * [H(\omega) G(\omega)] \tag{8.37}$$

shown in Figure 8.23(g). Finally we multiply this with the display’s own built-in
reconstruction filter to get

$$D = M \{P * [H (F * S)]\} \tag{8.38}$$

Compare the original spectrum $F(\omega)$ in Figure 8.23(a) with the reconstructed
spectrum in Figure 8.23(i). They are clearly not the same. There are many new high
frequencies in our reconstructed signal that don’t belong. They die off in amplitude
following a sinc envelope, but they never disappear. So even though our sampling was
perfect, our reconstruction was not and we did not recover our original signal.
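
The slow decay of those unwanted frequencies can be checked numerically. The Python
sketch below evaluates the transform of a unit-height box at the peaks of its side lobes.
It assumes the convention $F(\omega) = \int f(t) e^{-j\omega t}\,dt$, which may differ from
the scaling used elsewhere in this chapter by a constant factor, and the half-width chosen
is arbitrary.

```python
import math

def box_spectrum(omega, half_width):
    """Fourier transform of a unit-height box on [-half_width, half_width],
    using the convention F(w) = integral of f(t) exp(-j w t) dt."""
    if omega == 0.0:
        return 2.0 * half_width
    return 2.0 * math.sin(half_width * omega) / omega

half_width = 2.0      # illustrative; any positive value behaves the same way
# Side-lobe peaks sit near omega = (k + 1/2) * pi / half_width.
peaks = [abs(box_spectrum((k + 0.5) * math.pi / half_width, half_width))
         for k in range(1, 6)]
```

Each successive peak is smaller than the last, falling off like $1/\omega$, but none of
them is zero.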

This type of reconstruction is usually referred to by the unfortunate name of 
“box filtering” in computer graphics. This is somewhat misleading, since a box 
filter is exactly appropriate for multiplication in frequency space, but very poor for 
convolution in signal space. A more meaningful term for this approach would be 
“image-box filtering,” since that indicates that we are reconstructing with a box in 
image (or signal) space. 

Image-box filtering is not a total loss. The sinc function in frequency space meets
our basic qualitative criteria: the central copy of $F(\omega)$ isn’t attenuated too much,
and higher copies are suppressed (scaled down in amplitude). It would be nice if the
central copy was untouched and the higher harmonics completely damped, but the
sinc at least does part of the job.

A box is probably the easiest reconstruction filter to program, but it is far from 
ideal; its finite support and sharp edges guarantee a wide Fourier transform, and 
thus leakage of high frequencies in the output. 


8.5.2 Other Reconstruction Filters

Given that the box is a poor choice for a reconstruction filter, and a full sinc is
impossible to implement, what other shape might perform better? This question
immediately plunges us into the world of filter design. We will discuss filter
approximation in more detail in Chapter 10. Right now we will just make some general
observations about the implications of different filter shapes and their effects on
reconstruction.

As we saw in the previous chapter, filters are often classified as having a finite
impulse response (FIR) or an infinite impulse response (IIR). We have seen that
the ideal 1D reconstruction filter in frequency space is the box, but that the inverse
transform of this finite-support spectrum is an IIR sinc function. We cannot convolve
a signal with this function in a practical system because it requires us to have access
to the signal from negative to positive infinity. What we would prefer is a filter that
comes close to the box in frequency space, but still has a finite, reasonable width in
signal space. The basic difficulty is that, as we have seen, there is a natural inverse
relationship between the width of a signal and its Fourier transform; the narrower
one becomes, the wider the other spreads. We cannot really hope to find a very
boxlike filter with a small and finite impulse response.

Much of the field of filter design is aimed toward resolving this tension by producing
FIR filters with good frequency selectivity. Because the process is inherently a
trade-off, for each set of different desired characteristics there is a different approach
and set of filters. There is no one filter design technique that is superior to all others
in all applications.

One practical method for making a good equivalent to a spectral box is to design 
a filter that drops off to a very small value outside of some interval in both spaces. A 
popular choice is the Gaussian bump. We can easily find the signal that corresponds 
to a Gaussian filter of any particular width. If we think of the Gaussian as dropping 
off “almost to zero” at some distance from the center, then we might simply assert 
that it is zero beyond some distance in both spaces, thereby approximating a signal 
with finite support in both spaces. This is illustrated in Figure 8.24. This is analogous 
to windowing the Gaussians with a box in each domain. The resulting signal and 
spectrum no longer form a Fourier pair, but if we choose the cutoffs carefully, they 
can be close. 
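
To get a feel for how much is discarded by such a cutoff, the Python sketch below measures
the fraction of a Gaussian's area that lies beyond the cutoff in each domain. It relies on
the standard result that the transform of $e^{-at^2}$ is proportional to
$e^{-\omega^2/(4a)}$; the particular values of `a`, `t_c`, and `w_c` are arbitrary choices
for illustration.

```python
import math

def gaussian_tail_fraction(a, t_c):
    """Fraction of the area under exp(-a t^2) lying outside |t| <= t_c."""
    return math.erfc(math.sqrt(a) * t_c)

def spectrum_tail_fraction(a, w_c):
    """Same fraction for the transform, which is proportional to
    exp(-w^2 / (4 a))."""
    return math.erfc(w_c / (2.0 * math.sqrt(a)))

a = 1.0
for t_c, w_c in [(2.0, 4.0), (3.0, 6.0)]:
    print(t_c, gaussian_tail_fraction(a, t_c),
          w_c, spectrum_tail_fraction(a, w_c))
```

Increasing `a` narrows the time signal and widens its spectrum, so a cutoff that is
comfortable in one domain becomes tighter in the other. That is the inverse relationship
between widths described earlier.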

We will look more closely at reconstruction techniques when we survey practical 
signal-processing methods for computer graphics in Chapter 10. 


8.6 Supersampling 


It is common wisdom in computer graphics that you can reduce the aliasing artifacts
in a picture by supersampling. Supersampling means taking samples at a higher
frequency than you expect to eventually resample at. For pictures, this means sampling
more finely than once per pixel.

A common supersampling method is to place an n x n grid of supersamples within
each pixel, and then filter them into one value for that pixel. An example for n = 2
is shown in Figure 8.25.

This approach has a lot to offer: it’s conceptually simple and easy to implement.

FIGURE 8.24

A Fourier pair of Gaussians and their cutoff points. (a) The time signal. (b) Its Fourier transform.

FIGURE 8.25

Four different pixel centers for four separate images.

The downside is that it can be very slow; for the n x n grid, the rendering time 
goes up quadratically with n. Many rendering systems use values of 2, 3, or 4 for 
n, and values of 8 and more are not uncommon. Therefore it is important that 
we understand just what happens as n increases, so we can use the smallest value 
required for a desired amount of aliasing reduction. 
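
A small experiment makes the trade-off concrete. The Python sketch below is an
illustration only: it estimates how much of a unit pixel lies to the left of a vertical
edge, using an n x n grid of supersamples, and reports the number of samples spent. The
edge position of 0.3 and the function name `coverage_estimate` are assumptions made for
this example.

```python
def coverage_estimate(n, edge_x=0.3):
    """Fraction of an n x n grid of supersamples in the unit pixel that
    land left of a vertical edge at x = edge_x. Returns the estimate and
    the number of samples evaluated."""
    hits = 0
    for i in range(n):
        x = (i + 0.5) / n
        for j in range(n):          # y does not matter for a vertical edge
            if x < edge_x:
                hits += 1
    return hits / (n * n), n * n

results = {n: coverage_estimate(n) for n in (1, 2, 4, 8, 16)}
for n, (estimate, cost) in results.items():
    print(n, cost, abs(estimate - 0.3))
```

The cost grows as n squared, while for this simple edge the error is bounded by 1/(2n)
and does not shrink steadily from one value of n to the next.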

For simplicity, we will consider the monochromatic 1D case, where the image
corresponds to the intensity along a scan line, and the basic sampling function
corresponds to pixel centers.

Suppose that the scan line image is modeled by a CT signal $c(t)$. Almost certainly,
$c(t)$ will not be bandlimited, since it is defined by the models and the lights in the
scene. There are a special few classes of signals for which we can create $c(t)$ so that
we are guaranteed it is bandlimited [325], but such models are rare, and for most 3D
models, $c(t)$ will not be guaranteed to be bandlimited. We will assume, however, that
the energy in $C(\omega)$ tends to diminish as $\omega$ increases; that is, the higher the frequency,
in general, the less of a contribution it makes. This condition seems to be fulfilled
by most realistic images.

FIGURE 8.26

A pixel grid and the sampling function that models it.

To model the rendering system we create a sampling function $s_p(t)$, which is made
of impulses spaced $p$ units apart, equal to the distance between pixel centers:

$$s_p(t) = \mathrm{III}_p(t) \tag{8.39}$$

This is diagrammed in Figure 8.26. 

The result of sampling the signal $c(t)$ is a DT signal $d[n] = c[pn]$, which we can
display directly by assigning the value of $d[n]$ to pixel $n$.

We know that

$$\mathrm{III}_p(t) \overset{\mathcal{F}}{\longleftrightarrow} \mathrm{III}_{2\pi/p}(\omega) \tag{8.40}$$

so our sampling frequency $\omega_s = 2\pi/p$. The sampling theorem tells us that the
Nyquist frequency $\omega_N$ for this sampling density is given by $\omega_s = 2\omega_N$, so $\omega_N = \pi/p$.
This is the cutoff frequency for our pixel sampling; any energy above this frequency
will not get sampled properly, and will show up as some sort of aliasing effect.
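
The following Python sketch shows one way to see where that energy ends up. It folds a
frequency above $\pi/p$ back into the representable band and confirms that the two cosines
produce the same pixel samples. The helper name `alias_frequency` and the test values are
illustrative assumptions.

```python
import math

def alias_frequency(omega, p):
    """Frequency in [0, pi/p] that omega is indistinguishable from after
    sampling with spacing p (sampling frequency 2*pi/p)."""
    omega_s = 2.0 * math.pi / p
    folded = math.fmod(abs(omega), omega_s)
    if folded > omega_s / 2.0:
        folded = omega_s - folded
    return folded

p = 1.0
omega = 1.5 * math.pi / p                 # above the Nyquist frequency pi/p
omega_alias = alias_frequency(omega, p)   # folds down to 0.5 * pi / p
samples_true = [math.cos(omega * n * p) for n in range(8)]
samples_alias = [math.cos(omega_alias * n * p) for n in range(8)]
```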

Let’s supersample this signal and see what happens. Our model of supersampling 
is shown in Figure 8.27. We first sample the image function with an impulse train 
that takes several samples per pixel. This result is then reconstructed and filtered, 
and the new signal is resampled at the pixel rate. 








A model of supersampling. 



FIGURE 8.28

A pixel grid and a supersampling impulse train. 


The supersampling impulse train $s_s(t)$, which takes $a$ samples in each pixel, is
given by

$$s_s(t) = \mathrm{III}_{p/a}(t) \tag{8.41}$$

and is diagrammed in Figure 8.28.

As in the pixel-rate case, the result of rendering is a discrete signal $g[n] = c[np/a]$.
The new sampling frequency is $\omega_s = 2\pi a/p$, so the new Nyquist frequency is
$\omega_N = a\pi/p$; the limit of useful information is a factor of $a$ higher than for pixel-level
sampling.

FIGURE 8.29

The reconstruction filter $R(\omega)$.

We cannot display $g[n]$, though, since it is not matched to our pixel display
function; we cannot assign all $a$ values to any one pixel. To display $g[n]$, we first
reconstruct it as a continuous-time signal and then resample that signal at the pixel
rate. We will take the opportunity to filter the signal between these two steps to
reduce aliasing.

First we reconstruct the continuous-time function $g(t)$ from the discrete-time
function $g[n]$ by low-pass filtering the latter. For now, we will assume perfect filtering,
and use a low-pass reconstruction filter $R(\omega)$ with cutoff frequency $\omega = a\pi/p$,

$$R(\omega) = b_{a\pi/p}(\omega) \tag{8.42}$$

as in Figure 8.29, and multiply it with the spectrum of $g[n]$:

$$N(\omega) = G(\Omega) R(\omega) \tag{8.43}$$

The next step is to low-pass filter $N(\omega)$, so that when we sample it with the
pixel-rate impulse train $s_p(t)$, we won’t introduce any new aliasing. We saw above
that the cutoff frequency for $s_p(t)$ is $\omega_N = \pi/p$, so we can build a low-pass filter
$F(\omega)$ from a box with this half-width:

$$F(\omega) = b_{\pi/p}(\omega) \tag{8.44}$$

This filter is shown in Figure 8.30.

FIGURE 8.30

The low-pass filter $F(\omega)$.


We now have a new bandlimited signal $B(\omega)$:

$$B(\omega) = F(\omega) G(\omega) \tag{8.45}$$

Since $B(\omega)$ is correctly bandlimited, we can sample it with $s_p(t)$ with confidence that
we won’t introduce any new aliasing errors. The result is a new display signal $d_s[n]$,
which is now matched to our display pixel rate. In signal and frequency space, the
expressions for $d_s[n]$ are

$$d_s[n] = s_p(t) \left( f(t) * \left( r(t) * \left( s_s(t) c(t) \right) \right) \right)$$

$$D_s[n] = S_p(\omega) * \left( F(\omega) R(\omega) \left( S_s(\omega) * C(\omega) \right) \right) \tag{8.46}$$

corresponding to Figure 8.27.

What have we gained by supersampling? When we sampled at the pixel rate
to create $d_p[n]$, any information above $\pi/p$ turned into alias artifacts. When we
supersampled to create $d_s[n]$, we initially sampled at $a\pi/p$. By reconstructing and
filtering, we eliminated the chance that any information below $a\pi/p$ would turn into
aliases.

Thus any energy in the band $\pi/p < \omega < a\pi/p$ that turned into artifacts in the
pixel-sampled image has been correctly accounted for in the supersampled image.
If the image $c(t)$ indeed has generally less energy with increasing frequency, as we
assumed above, then we have eliminated aliasing information where it mattered the
most. As we increase the value of $a$, we remove more and more aliases from the
image.
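
The whole pipeline can be tried on a single cosine whose frequency lies in the band
$\pi/p < \omega < a\pi/p$. The Python sketch below is an illustration under stated
assumptions, not the perfect filtering described above: it replaces the ideal boxes with
a Hann-windowed sinc of finite length (a swapped-in practical filter), and the names
`lowpass_taps` and `supersampled_pixels` are invented for this example.

```python
import math

def sinc(x):
    return 1.0 if x == 0.0 else math.sin(math.pi * x) / (math.pi * x)

def lowpass_taps(a, half_len):
    """Hann-windowed sinc with cutoff pi/p, tabulated at the supersample
    spacing p/a. It stands in for the combined filter F(w)R(w)."""
    taps = []
    for k in range(-half_len, half_len + 1):
        window = 0.5 * (1.0 + math.cos(math.pi * k / (half_len + 1)))
        taps.append(sinc(k / a) * window / a)
    return taps

def supersampled_pixels(c, p, a, num_pixels, half_len=64):
    """Sample c(t) at spacing p/a, low-pass filter, resample at spacing p."""
    taps = lowpass_taps(a, half_len)
    pixels = []
    for n in range(num_pixels):
        acc = 0.0
        for i, k in enumerate(range(-half_len, half_len + 1)):
            acc += taps[i] * c((n * a + k) * p / a)   # supersample near pixel n
        pixels.append(acc)
    return pixels

p, a = 1.0, 4
omega = 1.5 * math.pi / p          # lies in the band pi/p < w < a*pi/p

def c(t):
    return math.cos(omega * t)

direct = [c(n * p) for n in range(16)]           # pixel-rate sampling
filtered = supersampled_pixels(c, p, a, 16)      # supersample, then filter
```

Sampled directly at the pixel rate, this cosine shows up at full amplitude as an alias.
After supersampling and filtering it is almost entirely removed, which is the correct
result for a frequency the pixel grid cannot represent.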

It is important to note that by construction,

$$F(\omega) R(\omega) = F(\omega) \tag{8.47}$$

since $F(\omega)$ is a box function that is completely contained within $R(\omega)$, as shown in
Figure 8.31.

FIGURE 8.31

By construction, $F(\omega) R(\omega) = F(\omega)$.

This is the reason we don’t usually explicitly reconstruct when implementing this 
technique; the perfect low-pass filter contains the perfect reconstruction filter for this 
operation. On the other hand, as we mentioned earlier, nobody implements perfect 
filters because they require infinite information. So the reconstruction, if performed 
explicitly, is not usually exact, and the low-pass filter also is not. 

As we discussed above, there is no one choice in signal space for an approximation
to the impulse response of a perfect filter. We will discuss this in more detail later,
but we note now that common choices include boxes, Gaussians, and clipped sinc
functions. Strictly, one should be sure that whatever choice is made for reconstruction
and low-pass filters, the reconstruction step is only ignored if the two filters nest.

If the low-pass filter is not completely (or at least substantially) within the
reconstruction filter, the two stages should be implemented separately. Otherwise the
filtering will be imperfect, and aliases will show up in the reconstructed signal.
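
The nesting condition is easy to test numerically for any pair of candidate frequency
responses. The Python sketch below checks it for ideal boxes on a grid of test
frequencies; the function names and the frequency grid are assumptions made for this
illustration, and a practical check would substitute the responses of the filters
actually in use.

```python
import math

def box(omega, half_width):
    """Ideal low-pass box b_w(omega): 1 inside [-w, w], 0 outside."""
    return 1.0 if abs(omega) <= half_width else 0.0

def filters_nest(F, R, omegas):
    """True if F(w) * R(w) == F(w) at every test frequency, that is, if the
    low-pass filter F lies entirely inside the reconstruction filter R."""
    return all(F(w) * R(w) == F(w) for w in omegas)

p, a = 1.0, 4
omegas = [k * 0.01 * math.pi for k in range(-800, 801)]

def F(w):
    return box(w, math.pi / p)          # low-pass filter, cutoff pi/p

def R(w):
    return box(w, a * math.pi / p)      # reconstruction filter, cutoff a*pi/p

nested = filters_nest(F, R, omegas)     # the ideal boxes nest
swapped = filters_nest(R, F, omegas)    # the wider filter does not fit
```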


8.7 Further Reading 

Bracewell’s book on Fourier theory [61] and the book by Oppenheim et al. [327]
contain discussions of 1D sampling and reconstruction that will be useful as an
expanded introduction. Additional details may be found in the books by Oppenheim
and Schafer [326] and Gabel and Roberts [151].

For extensive discussions of multidimensional sampling and reconstruction, see
the book by Dudgeon and Mersereau [130]. They also discuss hexagonal sampling
lattices in some detail. An extensive discussion of sampling and filtering issues
in computer graphics is offered by Wolberg [485], who also provides a wealth of
implementation information. A thorough discussion of the sampling theorem and
various extensions to it is presented by Jerri [231].


8.8 Exercises 

Exercise 8.1

Consider the image $d(t)$ shown across one scan line of a display device after a
bandlimited signal $f(t)$ has been sampled with a shah function $s(t)$ and then reconstructed
with a zero-order sample-and-hold impulse response $h_p(t)$. Writing $g(t) = f(t)s(t)$,
we can write this equation in both signal and Fourier spaces:

$$d(t) = g(t) * h_p(t)$$

$$D(\omega) = G(\omega) H_p(\omega) \tag{8.48}$$

A plot of $d(t)$ for a given $f(t)$ and an interpixel spacing $W$ is shown in Figure 8.32.

As Figure 8.32 shows, $d(t) \neq f(t)$. We recall that when the reconstruction filter
$r(t)$ is a sinc function of the appropriate frequency, then we can reconstruct $f(t)$
exactly.

To make our display match our function, we can insert a hypothetical filter with
impulse response $m(t)$ into the system just before the sample-and-hold, so that the
series of filters $m(t) * h_p(t) = r(t)$. In symbols,

$$f(t) = g(t) * m(t) * h_p(t) = g(t) * r(t) \tag{8.49}$$

We know that when $m(t) * h_p(t) = r(t)$, then reconstruction will be perfect. We can
find $m(t)$ by writing the system in the frequency domain, since we know both $R(\omega)$
and $H_p(\omega)$:

$$F(\omega) = G(\omega) R(\omega) = G(\omega) M(\omega) H_p(\omega) \tag{8.50}$$

so we can solve for $M(\omega)$:

$$M(\omega) = R(\omega) / H_p(\omega) \tag{8.51}$$

(a) We might expect to see the filter $M(\omega)$ in every output device, so that the
displayed signal $d(t)$ would match the original input $f(t)$. Is such a filter
inside every CRT? If not, why not?

(b) If you answered no to (a), then you could still consider implementing $M(\omega)$
(or its equivalent impulse response $m(t)$) in software in a rendering program.
Do you know of any software incorporating this filter? Would you advocate
its inclusion in software rendering systems? If so, carefully describe where and
how you would implement it. If not, explain your reasoning.

FIGURE 8.33

A simplified system diagram for device display for Exercise 8.1.

Exercise 8.2

Suppose we have a Gaussian spectrum $F(\omega) = e^{-a\omega^2}$, and we assert that $F(\omega) = 0$
for all values of $\omega$ above some cutoff $\omega_c$, so that you had a new spectrum $F'(\omega)$ such
that $F'(|\omega| > \omega_c) = 0$.

(a) Write an expression for $F'(\omega)$ as the product of a Gaussian and a box.

(b) Find the inverse transform of $F'(\omega)$.

(c) Compare your answer to (b) with the inverse transform of $F(\omega)$. What can you
say about the effect in the signal domain of windowing the filter in frequency
space? What happens qualitatively to the spectrum as the cutoff frequency
moves in toward the center of the hump? What does this require of the time
signal?

(d) Answer parts (a) through (c), but reverse the domains. That is, consider a
time signal $f(t) = e^{-at^2}$, clipped to zero for all $t > t_c$, so you had a new signal
$f'(t)$ such that $f'(|t| > t_c) = 0$. Find the Fourier transforms of $f(t)$ and $f'(t)$, and
compare them as in (c).

(e) Compare your answers to parts (c) and (d). Where would you window the 
Gaussian: frequency space, signal space, or both? 

(f) How would you apply a clipped Gaussian in signal space? 



Once is an instance. Twice may be an accident.
But three times or more makes a pattern.

Diane Ackerman

(Preface, in “By Nature’s Design,” Pat Murphy
and William Neill, 1993)



NONUNIFORM SAMPLING AND 

RECONSTRUCTION 


9.1 Introduction

This chapter discusses the signal processing theory behind nonuniform sampling , 
which is the label we give to any sampling pattern that cannot be described as a 
regular lattice. Nonuniform signal processing has become important in computer 
graphics in recent years for two principal reasons: it offers us the chance to use 
variable sampling density , and it allows us to trade structured aliasing for noise. We 
will discuss these ideas in turn. 


9.1.1 Variable Sampling Density 

Consider the typical modern image-synthesis system. To generate an image, we
need to estimate the image-plane signal. From a signal-processing point of view, this
requires sampling the underlying continuous image signal at a rate above the Nyquist
frequency. There is a tremendous range of image types and image complexities, and
we don’t yet have a good measure to describe a “typical” image, or even to categorize
the types of images. But most images made today, whether intended for use as stills
or frames of animation, are typically not of uniform complexity. There are regions
of great complexity (typically in the foreground, where there can be many objects
and textures) and regions of relative uniformity (typically in the background or
in shadows, where the image function is constant or only slowly changing). In
computer graphics every sample is expensive to compute, so we don’t want to waste
even a single sample of the image function.
Where should our samples be placed so they will do us the most good?

FIGURE 9.1

(a) An image we would like to sample. (b) The high sampling density required for the face induces a
high-density pattern everywhere. (c) An adaptive pattern where density is proportional to intensity.
(d) An adaptive pattern where density is proportional to local contrast.

Uniform sampling establishes a single, constant sampling density across the image. 
For example, consider Figure 9.1(a). If we want to capture the high-frequency 
information in the foreground of the scene, then we need a sampling rate that will 
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satisfy the Nyquist limit in that region. Because uniform sampling applies the same 
sampling pattern everywhere in the scene, the entire image plane will be covered 
with the pattern generated in response to that one region, as shown in Figure 9.1(b). 
This image has a smooth, flat background, so we will take many identical samples 
of this background, even though the frequency information in this part of the scene 
is very low. 

This seems wasteful, but the only technique we have discussed so far for increasing 
the sampling density anywhere is to increase the sampling density everywhere. Our 
intuition may say that in smooth regions, such as the background of Figure 9.1(a), 
we only need a few samples, widely spaced, in contrast to the dense distribution 
needed in the foreground. Such sampling distributions are shown in Figure 9.1(c) 
and (d). Although we have been discussing a concrete example, these issues apply to 
any signal. 

Note that there is no theoretical principle stopping us from placing and evaluating
samples anywhere we want. But completely unstructured sampling raises two
problems: the first is to figure out how many samples we need in each region of the 
signal, and the second is what to do with the samples once we’ve evaluated them. 
These are the refinement and reconstruction problems, discussed next. 

We solved the refinement problem in the example by observation, but in general 
we would like an automatic method that will determine where a high sampling 
density is required, and where a lower-density distribution will do. This is called 
adaptive sampling . We will discuss some approaches to adaptive sampling below 
and in the next chapter. 
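
As a first taste of the idea, here is a minimal Python sketch of one possible refinement
rule in 1D. It is an assumption made for illustration, not a method taken from the
text: an interval is split in half whenever the signal values at its two ends differ by
more than a tolerance. The names `adaptive_samples` and `step` are hypothetical.

```python
def adaptive_samples(f, lo, hi, tol, max_depth=8):
    """Recursively bisect [lo, hi] wherever the endpoint values of f
    differ by more than tol; returns the sorted sample positions."""
    def refine(a, fa, b, fb, depth, out):
        if depth > 0 and abs(fb - fa) > tol:
            mid = 0.5 * (a + b)
            fm = f(mid)
            refine(a, fa, mid, fm, depth - 1, out)   # left half first
            out.append(mid)
            refine(mid, fm, b, fb, depth - 1, out)   # then right half
    out = [lo]
    refine(lo, f(lo), hi, f(hi), max_depth, out)
    out.append(hi)
    return out

# Illustrative signal: flat background with a sharp step at t = 0.7.
def step(t):
    return 1.0 if t >= 0.7 else 0.0

positions = adaptive_samples(step, 0.0, 1.0, tol=0.1)
near_edge = [t for t in positions if 0.6 <= t <= 0.8]
far_from_edge = [t for t in positions if t < 0.5]
```

The samples cluster around the step and leave the flat regions almost untouched. A rule
this simple can miss a feature that falls entirely between two samples with similar
values, which is one reason more careful criteria are needed in practice.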

The second problem mentioned above is reconstruction : what we should do with 
the samples once they have been evaluated. When the samples aren’t on a regular 
lattice, we are faced with nonuniform reconstruction . Typically we will want to take 
our high-density samples and somehow reduce them to a single value representative 
of the region. In previous chapters we saw that to accomplish this with uniform 
samples, we could reconstruct a continuous-time signal from the samples, low-pass 
filter that signal to the Nyquist limit of the resampling grid, and then resample the 
filtered signal. This basic idea still holds, but the reconstruction part is harder. All of 
our sampling and reconstruction formulas in Chapter 8 were based on samples taken 
in equal increments of T, the sampling interval. If we sample adaptively as discussed
above, then we may have holes in our string of regularly spaced samples, or they may 
not even be spaced on a regular grid at all. We will need new methods to reconstruct 
from nonuniform samples; these are discussed in this chapter and the next. 


9.1.2 Trading Aliasing for Noise

All of the signal processing theory in the preceding chapters has been built on the
assumption that our samples are regularly spaced by an equal amount. In 1D, each
sample was separated from the next by the sampling interval $T$. In 2D, each sample
was $T_x$ units from its left and right neighbors, and $T_y$ units from its neighbors above
and below; for a square grid, $T_x = T_y$. We saw that corresponding to this sampling
rate is a maximum frequency that can be captured from the signal. Any energy
at frequencies above the Nyquist limit of that sampling pattern
will alias and become an inseparable part of our estimated signal.

This is a pretty bad situation, since it is common for the signals used in computer 
graphics to include arbitrarily high frequencies. For example, an image or illumination
signal with a single sharp border, such as the silhouette of a sphere, the edge
of a polygon, or the line between two squares on a black-white checkerboard, will 
contain infinitely high frequencies. The amount of energy drops off as the frequency 
goes up, but in general we don’t know where the frequency cutoff point should be. 
Much of the time we simply sample at higher and higher rates, with increasingly 
dense sets of samples, until we either meet a threshold or simply give up and stop 
sampling. 

Let’s look more closely at our theory and see what we might change in an attempt
to make problems such as sampling the image in Figure 9.1 more tractable. We
have assumed that our samples are instantaneous values of the signal; that is, when
sampling a function $f$ with a sample at $x_i$, the value of the sample is $f(x_i)$. We can
imagine changing this so that the value of the sample is something else, say

$$\int_{x_i - \epsilon_i}^{x_i + \epsilon_i} f(x) \, dx \tag{9.1}$$

for some choice of $\epsilon_i$. This approach has been explored in various ways, most
notably with cone tracing by Amanatides [9] and beam tracing by Heckbert and 
Hanrahan [211]. Those methods have merit, but they lose the simplicity of the 
point-sampling approach. Evaluating a signal at a single point is in general a much 
easier and better-understood problem than integrating over a small region; indeed, 
many point samples can be used to evaluate an integral, as we saw in our discussion 
of Monte Carlo integration in Chapter 7. Another advantage of point sampling 
is its computational efficiency; ray tracing is a sophisticated and efficient tool for 
evaluating point samples of many signals encountered in image synthesis. 
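
The remark that point samples can stand in for the integral of Equation 9.1 can be
sketched as follows. This Python fragment is a minimal Monte Carlo estimate under
stated assumptions: the points are drawn uniformly over the interval, the seed is fixed
only to make runs repeatable, and the test function and the name `area_sample` are
illustrative.

```python
import random

def area_sample(f, x_i, eps, num_points, rng):
    """Estimate the integral of f over [x_i - eps, x_i + eps] from
    uniformly distributed point samples."""
    width = 2.0 * eps
    total = 0.0
    for _ in range(num_points):
        total += f(rng.uniform(x_i - eps, x_i + eps))
    return width * total / num_points

rng = random.Random(1)              # fixed seed so runs are repeatable
estimate = area_sample(lambda x: x * x, 0.5, 0.1, 20000, rng)
exact = (0.6 ** 3 - 0.4 ** 3) / 3.0   # closed form for this test function
```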

A different generalization of the sampling theory we have seen so far is to change 
the quantization of the sample. Rather than assign a single number to the sample 
value, we could instead attach to it an interval, perhaps with an associated confidence 
function expressing our expectation of the likelihood of the various values in the 
interval. This approach has not been developed very much in computer graphics. 
The potential is interesting; perhaps we could find a way to compute rough estimates 
of the value of signals much more cheaply than high-accuracy values. Then many 
rough samples might combine to give an equally useful representation of the signal 
as a smaller number of accurate samples, at less cost. Although interval analysis has
appeared in computer graphics, I do not know of any reported work that explores
this particular approach for rendering problems.

FIGURE 9.2

(a) Stochastic sampling in 1D. (b) Stochastic sampling in 2D.

Another alternative is to move the samples off the regular, uniform pattern. So
rather than sample a 1D signal at equal increments of $T$, evaluating $f(t_n) = f(nT)$,
we allow the samples to fall anywhere “at random,” as in Figure 9.2(a). In computer
graphics, this approach is called stochastic sampling . The precise meaning of random 
in this application is important, and we will return to the subject below. Stochastic 
sampling may be applied in 2D, so rather than placing samples on a rectangular or 
hexagonal grid, we allow them to fall anywhere in the domain “at random,” as in 
Figure 9.2(b). 

Stochastic sampling is thus a special form of nonuniform sampling where the 
samples are aperiodic , meaning that there is no single structure that is repeated by 
translation at equal intervals across the domain. Since there is no repeated unit, 
there is no “pattern” associated with aperiodic sampling, but rather just a single 
arrangement of samples. 

We will analyze some different aperiodic sampling distributions below, for different
interpretations of “random” when the samples are placed. The major result
of that section will be that if the samples are placed randomly in the domain, the
highly structured aliasing artifacts that intrude in uniformly sampled signals turn
into high-frequency noise. The resulting reconstructed signal is still wrong, in the
sense that we have not captured all the high-frequency information contained in
the original signal, but the nature of the resulting problem has changed. Figure 9.3
shows a comparison of the same image sampled uniformly and stochastically, with
the same number of samples. The regular structures that come from the sampling
have been turned into high-frequency noise.

FIGURE 9.3

(a) The function $f(x) = \sin(2\pi(4x)^2)$ over the interval [0, 1]. (b) $f(x)$ sampled on a 200 x 200
grid. (c) $f(x)$ sampled on a uniform 40 x 40 grid; notice the aliasing on the right side. (d) $f(x)$
sampled on a jittered 40 x 40 grid; the regular aliasing artifacts have been turned into noise artifacts.
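
One simple interpretation of “random” is jittering: keep the grid of cells, but place each
sample at a random position inside its own cell. The Python sketch below generates such a
pattern and evaluates the test function from Figure 9.3 at the jittered locations. It is
an illustration only; the exact jitter scheme behind the figure is not specified here, and
the name `jittered_positions` and the seed are assumptions.

```python
import math
import random

def jittered_positions(n, rng):
    """One sample per cell of an n x n grid on the unit square, placed
    uniformly at random inside its cell."""
    return [((i + rng.random()) / n, (j + rng.random()) / n)
            for i in range(n) for j in range(n)]

def f(x):
    return math.sin(2.0 * math.pi * (4.0 * x) ** 2)

rng = random.Random(7)                 # fixed seed so runs are repeatable
points = jittered_positions(40, rng)
values = [f(x) for x, _y in points]    # the test function depends on x only
```

Because every cell still receives exactly one sample, the overall density matches the
uniform 40 x 40 grid; only the regularity of the spacing is lost.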

Whether this is an advantage or not depends on the intended use of the sampled 
signal. If it is an image to be shown to a person, the noisy version is often superior 
because the human visual system is able to ignore a surprising amount of noise of 
different characteristics, while it spots structured aliasing artifacts quite easily. Thus 
a noisy picture can “look better” than one that is aliased, though neither is more 
accurate than the other in terms of the numerical quality of match to the underlying 
image signal. If you are creating an image for the purpose of analysis, either manually 
or by machine, then the noise may prove to be a greater impediment than organized 
aliasing errors. For these signals and others, such as illumination functions, you must 
consider what information is desired from the signal and how it will be used. But for 
many applications, stochastic sampling represents a powerful way to sample a signal 
that contains arbitrarily high frequencies without incurring the risk of introducing 
new structures that appear to be part of the signal. We will return to this subject in 
more detail later on. 


9.1.3 Summary 

This introduction has presented two main points. First, adaptive sampling is
intuitively appealing, because it allows us to put more samples where they are needed,
and fewer samples in smooth areas of the signal. Second, though we have not yet 
proven it, stochastic sampling allows us to trade regular aliasing artifacts for high- 
frequency noise, which is sometimes more acceptable to the human visual system 
when looking at images. 

We need to be careful when creating noisy pictures. If the noise is of low frequency, 
then it can create ugly artifacts that are just as objectionable as aliasing. As a general 
rule of thumb for images, if the noise is beyond the Nyquist frequency, then it will 
look better than aliases from those frequencies. 

The sections below will examine nonuniform sampling and reconstruction in 
more detail. 


9.2 Nonuniform Sampling

There are many types of nonuniform sampling methods. Some are based on a
prototype tile, which is a small precomputed pattern that is replicated (with
transformations) over the domain. Others are based on generating sample locations on
demand as the sampling proceeds.

We will explore these in turn.


9.2.1 Adaptive Sampling

As discussed above, many of the signals used in computer graphics have some regions 
of high complexity and detail, and other regions that are simple, in the sense that 
the signal is smooth or even constant within the region. Adaptive sampling methods 
attempt to put samples where they will do the most good by concentrating them in 
complex regions and leaving them sparse in simple regions. 

We are rather lucky in computer graphics to have the freedom to take samples
anywhere we want, and then go back and revisit the signal to gather more samples.
Many other disciplines that use signal-processing techniques do not have this
freedom. For example, the telephone company needs to sample and reconstruct audio
signals in real time from an audio stream that only comes by once. A similar
situation holds true for television transmission and reception; in a TV receiver, the
video signal comes off the antenna and into the set, and there is no chance to go back
and pick up missed pieces of the signal.

In signal-processing terms, the idea is to adapt the local sampling rate to the local 
bandwidth in a given part of the image. Although we have been discussing an image 
as a concrete example, these issues apply to any signal where the local bandwidth 
varies more than a little. We can think of the local bandwidth as the result of taking 
a short-term Fourier transform of the signal in some narrow region, as discussed in 
Chapter 6. Recall that in that chapter we pointed out some problems with the STFT. 
But we will use this idea mostly as a conceptual tool in this chapter and not for actual 
computation. We will see various practical means for estimating local bandwidth in 
the section on reconstruction below and in Chapter 10. 

Adaptive sampling is often implemented by first sampling with a base pattern of
some predetermined density. This creates a set of base samples, which form our first
estimate of the image. Typically, we examine these samples to determine if we need
to take more samples to refine our estimate. We evaluate the samples with respect to
some refinement criteria. If we decide more samples are required, then we invoke a
refinement strategy that creates and evaluates additional samples in the region. This
step may only be applied once, resulting in a two-stage refinement strategy. More
generally, the refinement step may be applied iteratively until the criteria are met
or some upper limit on the number of iterations is exceeded, in effect refining our
sampling of the signal. This sampling technique is called adaptive refinement.
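The loop just described can be sketched in a few lines of Python. This is only a
minimal illustration under stated assumptions, not a method taken from the text: the
signal is a 1D function on an interval, the base pattern is four evenly spaced samples,
the refinement criterion is the spread (maximum minus minimum) of the sample values, and
the refinement strategy splits the interval in half. The names `adaptive_mean`, `base`,
`tol`, and `max_depth` are hypothetical.

```python
def adaptive_mean(signal, lo, hi, base=4, tol=0.05, max_depth=8):
    """Estimate the mean of `signal` on [lo, hi); return (estimate, samples used)."""
    # Base samples: `base` evenly spaced points at the centers of equal cells.
    xs = [lo + (i + 0.5) * (hi - lo) / base for i in range(base)]
    values = [signal(x) for x in xs]

    # Refinement criterion (assumed): stop when the values agree to within `tol`,
    # or when the iteration limit has been reached.
    if max_depth == 0 or max(values) - min(values) <= tol:
        return sum(values) / base, base

    # Refinement strategy (assumed): split the region in half and refine each half.
    mid = 0.5 * (lo + hi)
    left_mean, left_count = adaptive_mean(signal, lo, mid, base, tol, max_depth - 1)
    right_mean, right_count = adaptive_mean(signal, mid, hi, base, tol, max_depth - 1)

    # The halves have equal width, so the combined mean is their plain average.
    return 0.5 * (left_mean + right_mean), base + left_count + right_count


def step(x):
    """A signal with a single edge at x = 0.3 (true mean over [0, 1) is 0.7)."""
    return 1.0 if x > 0.3 else 0.0


estimate, samples_used = adaptive_mean(step, 0.0, 1.0)
print(estimate, samples_used)
```

For the step signal, nearly all of the samples end up in the shrinking interval that
contains the edge, while the constant regions on either side are settled with a single
set of base samples each.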

The central issues in adaptive refinement are how to estimate the local bandwidth 
of the signal, where to place new samples, and what criteria to use to terminate the 
refinement process. 

Kirk and Arvo have pointed out [246] that the most straightforward form of 
refinement procedure tends to introduce a bias into the final result. The problem 
is that there is a subtle connection between the values of the base samples, the 
refinement test, and the value of the final set of samples. 

They demonstrate this with an elegant argument presented below. The departure
point is the pseudocode shown in Figure 9.4 that shows a simple form of adaptive
sampling. We will assume that our goal is to find the mean value of a signal within
some region; this is often what we are after in image synthesis.

    EstimateMean(S, T)
        S_n <- {X_1, X_2, ..., X_n}     Draw n samples from signal X.
        if quality(S_n) > T             Do the initial samples meet our criteria?
        then
            mu_hat <- mean(S_n)         Yes, the easy case. Use the sample mean.
        else
            mu_hat <- TrueMean(S)       No, the hard case. Invoke the oracle.
        endif
        return(mu_hat)
    end

FIGURE 9.4

Simple but biased adaptive refinement.

The refinement function is called with an initial set of n base samples $S_n$, and
some quality condition T. If the samples meet the quality condition, then we use the 
mean of those samples as our estimate of the true mean. Otherwise, we invoke some 
more costly procedure to get a better estimate of the mean. Kirk and Arvo refer to 
this step as “invoking an oracle,” which is meant to imply that the second procedure 
is expensive but perfectly accurate and always returns the true mean. This oracle 
is typically approximated by taking many additional samples and averaging them, 
though other implementations are possible. 

The procedure of Figure 9.4 is biased, meaning that we can expect the presence of
some consistent, repeatable error from any input. To isolate the bias in the technique,
we will construct a simple test case. Imagine that we are sampling a square domain
with area 1 made up of $k$ vertical stripes of different widths, as in Figure 9.5. Each
stripe $i$ has area $w_i$ and constant intensity value $I_i$. The true mean value $\mu$ of this
signal is then

$$\mu = \sum_{i=1}^{k} I_i w_i \tag{9.2}$$

(because the square has unit area and the stripes tile the square, the weights are
normalized; that is, $\sum_{i=1}^{k} w_i = 1$).

Let’s now find the value returned by the algorithm in Figure 9.4. We will assign
probabilities P[easy] and P[hard] to our chances of getting an easy or hard set of
samples, and expected values $E[\mu_{\text{easy}}]$ and $E[\mu_{\text{hard}}]$ to the value of the mean
computed for the two cases. Note that P[hard] = 1 − P[easy]. The expected value of the
algorithm is then the two expected values weighted by their respective probabilities:

$$E[\hat{\mu}] = P[\text{easy}] \times E[\mu_{\text{easy}}] + P[\text{hard}] \times E[\mu_{\text{hard}}] \tag{9.3}$$

FIGURE 9.5

The geometry for analyzing bias.

This problem fits the standard probability model for repeated Bernoulli trials with
multiple results (discussed in Appendix B). For this type of problem, we organize
our samples by how many strike each region. Thus we might find that $m_1$ samples
strike region 1, $m_2$ samples strike region 2, and so on. The theory tells us that the
probability of getting these $m_i$ strikes from $n$ samples from $k$ results with weights
$w_i$ is given by Equation B.12:

$$P[m_1, m_2, \ldots, m_k] = \frac{n!}{m_1!\, m_2! \cdots m_k!}\, w_1^{m_1} w_2^{m_2} \cdots w_k^{m_k} \tag{9.4}$$

For any particular input, we can find $E[\hat{\mu}]$ from Equation 9.4 for all possible
outcomes $(m_1, m_2, \ldots, m_k)$ and then weight the results by the corresponding
probabilities P[easy] and P[hard].

We would like to find a more useful form of Equation 9.3. We begin by filling
in the placeholder for the easy case with real values. Let us assume that the quality
test requires that all of the $n$ samples have the same value. Then in this easy case,
all but one $m_i$ in Equation 9.4 goes to zero (since only one type of event can occur),
$n = m_i$, and we get

$$P[\text{easy}] = w_1^n + w_2^n + \cdots + w_k^n \tag{9.5}$$

We now turn to the expected value from this result. We know that all $n$ values are
the same, but we can’t predict which value they will all take on. We can find the
expected value as a sum of the individual stripe values, normalized by the weights
and taking into account our $n$ samples:

$$E[\mu_{\text{easy}}] = \frac{w_1^n I_1 + w_2^n I_2 + \cdots + w_k^n I_k}{w_1^n + w_2^n + \cdots + w_k^n} \tag{9.6}$$

Substituting these results into Equation 9.3, recalling P[hard] = 1 − P[easy], and
noting that the oracle always returns the true mean, so that $E[\mu_{\text{hard}}] = \mu$, we find

$$E[\hat{\mu}] = \mu + \sum_{i=1}^{k} w_i^n (I_i - \mu) \tag{9.7}$$

Since the true mean is given by $\mu$, the right-hand term of Equation 9.7 represents an
error in our expected value that will usually not be zero; this is the bias.
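The bias can be checked numerically. The sketch below is an illustration only: it
evaluates the expected value of the simple estimator in closed form (true mean plus the
sum of $w_i^n (I_i - \mu)$ over the stripes) and compares it with a direct simulation of
the procedure in Figure 9.4, using the "all samples have the same value" quality test
described above. The stripe widths and intensities are made-up test values, the oracle
is modeled as returning the exact mean, and all function names are hypothetical.

```python
import bisect
import itertools
import random


def true_mean(widths, intensities):
    """Mean of the striped signal: sum of intensity times stripe area."""
    return sum(w * i for w, i in zip(widths, intensities))


def expected_biased_estimate(widths, intensities, n):
    """Closed-form expected value of the simple estimator for n base samples."""
    mu = true_mean(widths, intensities)
    bias = sum(w ** n * (i - mu) for w, i in zip(widths, intensities))
    return mu + bias


def simulate_biased_estimate(widths, intensities, n, trials, rng):
    """Run the easy/hard procedure many times and average what it returns."""
    mu = true_mean(widths, intensities)
    edges = list(itertools.accumulate(widths))   # right edge of each stripe
    last = len(widths) - 1
    total = 0.0
    for _ in range(trials):
        # Record which stripe each of the n uniformly placed samples lands in.
        hits = {min(bisect.bisect_right(edges, rng.random()), last)
                for _ in range(n)}
        if len(hits) == 1:
            total += intensities[hits.pop()]     # easy case: use the sample mean
        else:
            total += mu                          # hard case: the oracle's answer
    return total / trials


widths = [0.5, 0.3, 0.2]          # assumed stripe areas; they sum to 1
intensities = [0.0, 0.5, 1.0]     # assumed stripe values
print(true_mean(widths, intensities))
print(expected_biased_estimate(widths, intensities, 2))
print(simulate_biased_estimate(widths, intensities, 2, 100000, random.Random(1)))
```

With these values the true mean is 0.35 while the estimator's expected value is about
0.302, and the simulated average settles near the latter rather than the former.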

This bias is introduced because the statistics of the initial sample influence both 
our test and our calculation of the final value. The problem is somewhat like 
the situation when a number of unlikely events all happen simultaneously or in 
quick succession. For example, suppose you’re playing a game of cards, and you 
draw three sevens in a row. You might think that you’ve experienced something 
remarkable, compute the (low) probability that you would draw three sevens in a 
row, and wonder at the unlikely event. This is very human, but a little deceptive. 
Unlikely events happen all the time; after all, someone almost always wins every 
lottery drawing. The right way to view the amount of surprise in a situation is to 
ask yourself beforehand if that situation is likely to occur; then you have correctly 
set yourself up to react if it does come about. In the sampling case, the trick is to ask 
questions about the samples before they are drawn, not to analyze them afterward. 

To remove the bias, we need to eliminate this relationship between the evaluated 
samples and decisions made on their likelihood. Kirk and Arvo present a modified 
algorithm, shown in Figure 9.6, that avoids the bias problem. 

The basic idea is that we first draw a “pilot” set of samples from some region
$R$ contained within $X$. Those samples are used to estimate the mean of the signal
within $R$ only. We then test those samples; if they are sufficiently different, then we
will take $n_2$ samples from the remaining domain $X - R$. Otherwise, we will take $n_1$
samples from this region. Typically, $n_2 > n_1$, so we take many samples when the
pilot set is diverse, and only a few when it is more uniform. These new values are
then used to estimate the mean in $X - R$. The two means are then added together,
weighted by the relative sizes of the two domains.

Kirk and Arvo note that this method will typically take too many or too few
samples. In smooth regions, we would like $R$ to be small, so that only a few samples
need to be taken to confirm that the signal is easy in this region. When the signal is
very complex, we want a large $R$ in order to capture that complexity and trigger the
higher-density sampling. Since we don’t know the signal, we must take a guess for
$R$. If we guess too low, then the initial samples don’t contribute much to the final
mean, and they become less valuable. If we guess too high, then we take superfluous
samples when the region is smooth.

    UnbiasedEstimateMean(S, n_1, n_2, T)
        S_p <- {X_1, X_2, ..., X_p} in R, R in X    Draw p samples from region R in X.
        if quality(S_p) > T
            then n <- n_1
            else n <- n_2
        endif                                       Do we need only a few more samples or many?
        S_n <- {X_1, X_2, ..., X_n} in X - R        Take more samples from unsampled region of X.
        mu_p <- mean(S_p) * |R|
        mu_n <- mean(S_n) * |X - R|                 Find means for each sample set.
        mu_hat <- mu_p + mu_n                       Get the combined mean and return it.
    end

FIGURE 9.6

Unbiased adaptive refinement.
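A small Python sketch of the pilot-region idea follows, as an illustration under
stated assumptions rather than Kirk and Arvo's own code. The domain $X$ is taken to be
the unit interval, the pilot region $R$ is $[0, r)$, and the quality test is a simple
spread threshold; `unbiased_estimate_mean`, `stripes`, and the parameter names are
hypothetical. The key point is that the pilot values only decide how many samples are
taken elsewhere; they never select which values are averaged into the estimate of the
remaining domain.

```python
import random


def unbiased_estimate_mean(signal, r, p, n1, n2, tol, rng):
    """Two-region estimate of the mean of `signal` over [0, 1)."""
    # Pilot samples come only from R = [0, r).
    pilot = [signal(rng.uniform(0.0, r)) for _ in range(p)]

    # Assumed quality test: a small spread means few additional samples suffice.
    n = n1 if max(pilot) - min(pilot) <= tol else n2

    # Additional samples come only from the unsampled region X - R = [r, 1).
    rest = [signal(rng.uniform(r, 1.0)) for _ in range(n)]

    # Weight each regional mean by the size of its region, then combine.
    mu_p = sum(pilot) / p * r
    mu_n = sum(rest) / n * (1.0 - r)
    return mu_p + mu_n


def stripes(x):
    """Three stripes of widths 0.5, 0.3, 0.2; the true mean is 0.35."""
    if x < 0.5:
        return 0.0
    if x < 0.8:
        return 0.5
    return 1.0


rng = random.Random(7)
trials = 20000
average = sum(unbiased_estimate_mean(stripes, 0.6, 4, 2, 16, 0.0, rng)
              for _ in range(trials)) / trials
print(average)
```

Averaged over many runs, the result stays close to the true mean of 0.35, in contrast
to the simple procedure of Figure 9.4.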

Correcting for bias is an important theoretical point, but it is not clear at present 
how much it affects practical problems in computer graphics. The cost can be 
nontrivial, as mentioned above, since we never want to waste even a single sample. 

There have not yet been any published analyses indicating how much bias is 
tolerable in different parts of the rendering process. The best approach is probably 
a conservative one: we should avoid bias whenever possible. 

In the next chapter we will see a variety of techniques that have been proposed 
for carrying out adaptive sampling. There are many different refinement criteria 
used to trigger higher-density sampling, and different means for placing samples, but 
they all boil down to evaluating a coarse distribution of samples to estimate the local 
bandwidth of the signal, and then taking additional samples in regions where the 
bandwidth is high. 

We mentioned earlier that nonuniform sampling allows us to trade structured 
aliasing artifacts for noise. Consider that an aliasing problem arises because high 
frequencies beyond the Nyquist limit overlap with the original spectrum of the signal. 
These higher frequencies represent periodic sine and cosine signals and they combine 
to create what appear to be new structures in our reconstructed signal. In other 
words, the effect of aliasing isn’t to simply make the reconstructed signal a poor 
approximation of the original; it makes the signal wrong in highly structured ways.
For an image, these structures are particularly bad because the human visual system is
very good at detecting them. Even in more abstract situations such as shading, some 
algorithms may be particularly sensitive to structure in the original signal, which 
can damage the numerical accuracy of the simulation and ultimately cause visual 
artifacts such as erroneous shadows or highlights. A high-frequency problem (noise) 
is less likely to cause these large structured problems than a low-frequency problem 
(structured aliasing). High-frequency foldover is still aliasing, but we distinguish it 
with the term noise because of its unstructured effect on the signal. 


9.2.2 Aperiodic Sampling 

The point of this section is to show that when samples are taken aperiodically, then 
we can completely eliminate structured aliasing. Since the resulting sampled signal 
isn’t perfect, the error must go somewhere, and we show that it goes into noise (or 
unstructured aliasing). By choosing our sampling pattern carefully, we can choose 
how much of that noise is distributed into different frequencies. Since the human 
visual system is more tolerant of high- than low-frequency noise, we will typically 
want to push our sampling errors into the highest frequencies that we can achieve. 

The characteristics of aperiodic sampling that distinguish it from periodic
sampling are the increased likelihood of sampling all regular structures in the signal, and
the transformation of coherent aliasing into incoherent noise. Both of these can be
beneficial in some circumstances.

We mentioned above that nonuniform sampling can trade periodic structures
(aliases) for noise. We can derive this important result by looking at the Fourier
transform of an appropriate aperiodic sequence. The basic theory for this analysis
was presented by Leneman in a series of papers in the late 1960s [262-265], which
were applied to graphics by Dippé and Wold [124]. This section will give an overview
of the derivation of the basic results leading to the spectral characteristics of a signal
sampled with a nonuniform pattern. We will follow the developments as described
in references [124] and [265].

A few definitions will help us get started. An impulse process may be considered
a train of pulses characterized by some statistical parameters (note that the term
process is used here as roughly a synonym for sequence). For example, we will focus
on the impulse process $s(t)$ given by

$$s(t) = \sum_{n=-\infty}^{\infty} a_n\, \delta(t - t_n) \tag{9.8}$$

where the values $\{t_n\}$ are not uniform. This creates a series of impulses that are not
spaced at equal intervals.

If one or more of these pulses is missing, we call it a skip process. If the amplitudes
$a_n$ are unequal, then it is called a weighted process. We will limit our attention in
this book to processes where all $a_n = 1$. Initially we will assume there are no skips;
we will relax this requirement later on.

It is difficult to directly take the Fourier transform of a class of statistical signals, 
because we would have to choose a single instance of the parameters to create a 
single, specific signal. This would defeat our intention to retain the statistical nature 
of the signal in the transform. It is better to try to work with the collection of all 
signals specified by a set of parameters, called the ensemble. Studying properties of
the ensemble tells us what to expect statistically from signals that belong to it.

A basic tool for studying statistical signals is the autocorrelation. The
autocorrelation of a signal $f$ is written $R_f(\tau)$ (or sometimes $C(\tau)$). The autocorrelation
tells us how well a signal overlaps with itself when shifted. The idea is to move the
signal left or right by a given amount $\tau$, multiply the shifted signal with the original,
unshifted signal, and then find the expected value of the result. For example,
suppose the signal is a constant. Then the autocorrelation is also a constant: for every
shift of the signal, it lines up with itself perfectly. A sine wave of period $2\pi$ has a
normalized autocorrelation of 1 for $\tau = 0$, but a value of $-1$ at $\tau = \pi$, when the
shifted signal is exactly the negative of the original (so the sum of the two cancels).
We can write the autocorrelation for any shift $\tau$ as the expected value of the signal
multiplied by its shifted version:

$$R_f(\tau) = E[f(t + \tau) f(t)] \tag{9.9}$$
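As a quick numerical illustration of this definition, the sketch below estimates the
autocorrelation of a periodic signal by averaging the product of the signal and its
shifted copy over one period. Replacing the expected value by an average over one
period is an assumption that suits a deterministic periodic signal; the function name
`autocorrelation` and the `steps` parameter are hypothetical.

```python
import math


def autocorrelation(f, tau, period, steps=1000):
    """Average of f(t + tau) * f(t) over one period, sampled at `steps` points."""
    total = 0.0
    for k in range(steps):
        t = period * k / steps
        total += f(t + tau) * f(t)
    return total / steps


period = 2.0 * math.pi
r0 = autocorrelation(math.sin, 0.0, period)               # unshifted
r_quarter = autocorrelation(math.sin, math.pi / 2, period)  # quarter-period shift
r_half = autocorrelation(math.sin, math.pi, period)         # half-period shift
print(r0, r_quarter, r_half)
```

For the sine wave the unnormalized values are 1/2 at zero shift, 0 at a quarter-period
shift, and −1/2 at a half-period shift; dividing by the zero-shift value gives the
normalized autocorrelation.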

We note that the cross-correlation of two signals $f$ and $g$ is defined similarly:

$$R_{fg}(\tau) = E[f(t + \tau) g(t)] \tag{9.10}$$

The Fourier transform of the autocorrelation is called the power spectral density of
the function (PSD, or spectral density). We will represent the PSD of a function $f$
as $\Psi_f(\omega)$. Intuitively, the PSD expresses the Fourier transform of the ensemble of
signals, though each individual instance may vary.

The autocorrelation of $s(t)$ in Equation 9.8 is given by Leneman [265]:

$$R_s(\tau) = \beta(\delta(\tau) + \beta) \tag{9.11}$$

where $\beta$ is the average number of impulses per unit interval of time. The PSD for $s$
is given by

$$\Psi_s(\omega) = \mathcal{F}\{R_s(\tau)\} = \beta[1 + 2\pi\beta\,\delta(\omega)] \tag{9.12}$$

Equation 9.12 tells us that the transform of our nonuniform pulse train is a flat
spectrum of noise of constant amplitude $\beta$, with a single spike at $\omega = 0$. What happens
when we sample a signal with this nonuniform train of impulses?

We can’t simply convolve this with the Fourier transform of a signal, because this
is a PSD. Dippé and Wold suggested that to do the analysis, we consider a hypothetical
class of signals $f$ which is constructed so that we can find its autocorrelation $R_f$,
and thus its PSD $\Psi_f$. Then the sampling operation is represented by multiplying the
autocorrelations, giving us a sampled signal $g$ (here described by its autocorrelation
$R_g$):

$$R_g(\tau) = R_f(\tau) R_s(\tau) \tag{9.13}$$

This multiplication in the time domain turns into convolution in the frequency
domain. We don’t know $\Psi_f(\omega)$, but we can use our result for $\Psi_s(\omega)$ from Equation 9.12
above:

$$\begin{aligned}
\Psi_g(\omega) &= \Psi_f(\omega) * \Psi_s(\omega) \\
&= \Psi_f(\omega) * \beta[1 + 2\pi\beta\,\delta(\omega)] \\
&= \beta \int \Psi_f(\omega)\, d\omega + 2\pi\beta^2\, \Psi_f(\omega) \\
&= \beta k_p + (2\pi\beta^2)\, \Psi_f(\omega)
\end{aligned} \tag{9.14}$$

Equation 9.14 is very important: it shows that our sampled signal is completely free
of aliasing!

Equation 9.14 says that the result of sampling a signal $f$ with an impulse train
$s(t)$ is a signal whose PSD consists of a flat sea of noise of amplitude $\beta k_p$, and a
single copy of the PSD of the signal, scaled by $2\pi\beta^2$.

The most significant point of the discussion so far is that Equation 9.14 contains
only one copy of the transform of the signal. Recall that when we sampled with
a uniform impulse train, we created endless copies of the original spectrum in the
sampled signal. We called these aliases, and we needed to design and apply filters
that would remove these aliases without introducing new artifacts or removing useful
information. On the other hand, when we sample with a set of nonuniform impulses,
we get no aliasing at all. Instead, we get a sea of noise surrounding a single copy
of the PSD of the image. Most images are not appropriate for this analysis, but the
results encourage us to find practical methods for eliminating aliasing by using a
judicious choice of the sampling pattern [124].
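The contrast between the two kinds of impulse train can be seen in a small numerical
experiment. The sketch below is an illustration, not part of the cited analysis: it
measures the power of a finite impulse train at the sampling frequency $2\pi\beta$,
once for equally spaced impulses and once averaged over an ensemble of randomly placed
impulses. Placing a fixed number of points uniformly at random over the same extent is
used here as a stand-in for the random process, which is an assumption; the function
names are hypothetical.

```python
import cmath
import math
import random


def periodogram(times, omega):
    """Power of an impulse train at angular frequency omega, per impulse."""
    total = sum(cmath.exp(-1j * omega * t) for t in times)
    return abs(total) ** 2 / len(times)


def uniform_train(count, beta):
    """Impulses at equal intervals 1/beta."""
    return [k / beta for k in range(count)]


def random_train(count, beta, rng):
    """Impulses placed independently and uniformly over the same extent."""
    extent = count / beta
    return [rng.uniform(0.0, extent) for _ in range(count)]


def mean_random_power(count, beta, omega, realizations, rng):
    """Average the periodogram over several random trains (an ensemble estimate)."""
    powers = [periodogram(random_train(count, beta, rng), omega)
              for _ in range(realizations)]
    return sum(powers) / realizations


beta = 1.0
omega_s = 2.0 * math.pi * beta     # the sampling frequency
print(periodogram(uniform_train(64, beta), omega_s))
print(mean_random_power(64, beta, omega_s, 200, random.Random(0)))
```

The uniform train concentrates power proportional to the number of impulses at the
sampling frequency, which is the spike responsible for spectral copies. The random
ensemble shows only the flat noise level there.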

Dippé and Wold give a practical example of this analysis for a sampling pattern
where the impulses are at times $t_k$, given by

$$t_{k+1} = t_k + d_k \tag{9.15}$$

where the $d_k$ are increments that are generated by an exponential distribution. We
will also impose a minimum-distance constraint, which we will find useful in later
chapters. We require impulses to be separated by a minimum distance $d_0$:

$$d_k \sim p(d_k) = \begin{cases} \alpha e^{-\alpha(d_k - d_0)} & d_k > d_0 \\ 0 & \text{otherwise} \end{cases} \tag{9.16}$$

where $\sim$ indicates that $d_k$ is drawn from the distribution given by $p(d_k)$. The average
distance between samples is given by the expected value of $d_k$, which for this
distribution is $E[d_k] = d_0 + 1/\alpha$. The average sampling density is its reciprocal:

$$\beta = 1/E[d_k] = \alpha/(1 + \alpha d_0) \tag{9.17}$$

When $d_0 = 0$, this is uniformly distributed random sampling with an average
density $\beta = \alpha$. When $\alpha \to \infty$, the pattern becomes a uniform pattern with sampling
rate $\beta = 1/d_0$. The resulting power spectrum for this pattern [124] is


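$$\Psi_s(\omega) = \begin{cases}
\beta\left[1 - \dfrac{2\alpha\omega\sin(d_0\omega) + 2\alpha^2\cos(d_0\omega) - 2\alpha^2}{2\alpha\omega\sin(d_0\omega) - 2\alpha^2\cos(d_0\omega) + \omega^2 + 2\alpha^2}\right] & \omega \neq 0 \\[2ex]
2\pi\beta^2\,\delta(\omega) & \omega = 0
\end{cases} \tag{9.18}$$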

Figure 9.7 shows plots of Equation 9.18 for different values of $d_0\alpha$. Notice that by
increasing $d_0\alpha$, we can reduce the amount of low-frequency noise in the pattern and
transfer that energy to higher frequencies.
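A pattern of this kind is easy to generate. The following sketch is an illustration
under stated assumptions: each increment is the minimum distance $d_0$ plus an
exponentially distributed excess with rate $\alpha$, which is one way to draw from an
exponential distribution shifted to start at $d_0$. The function name
`min_distance_samples` and its parameters are hypothetical.

```python
import random


def min_distance_samples(count, alpha, d0, rng, start=0.0):
    """Sample positions whose successive gaps are d0 plus an exponential excess."""
    times = [start]
    while len(times) < count:
        gap = d0 + rng.expovariate(alpha)   # shifted exponential increment
        times.append(times[-1] + gap)
    return times


alpha, d0 = 4.0, 0.25
times = min_distance_samples(20001, alpha, d0, random.Random(3))
gaps = [b - a for a, b in zip(times, times[1:])]
mean_gap = sum(gaps) / len(gaps)
print(min(gaps), mean_gap, 1.0 / mean_gap)
```

No two samples come closer than $d_0$, and the empirical mean gap comes out close to
$d_0 + 1/\alpha$, whose reciprocal is the density $\alpha/(1 + \alpha d_0)$.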

We now turn to jittered sampling. We will discuss this in more detail later, but
the basic idea is that we generate a series of regular impulses, and then move each
one left or right from its original position [27]. We can write a jittered impulse train
as

$$s(t) = \sum_{n} \delta(t - nT - u_n) \tag{9.19}$$

where the $u_n$ are independent, uniformly distributed values drawn from the
distribution $p(u_n)$.

Leneman [265] uses the symbol $\gamma(\omega)$ to stand for the Fourier transform of $p(u_k)$
defined in Equation 9.16. Dippé and Wold [124] show that for this type of sampling
pattern, the PSD is given by

$$\Psi_s(\omega) = \beta\left[1 - |\gamma(\omega)|^2\right] + 2\pi\beta^2 |\gamma(\omega)|^2 \sum_{k=-\infty}^{\infty} \delta(\omega - k 2\pi\beta) \tag{9.20}$$

The term on the left is flat noise, as we saw in Equation 9.14. But on the right we see
an endless number of impulses; this will lead to endless copies of the signal spectrum
if we sample with this pattern, so it appears that we’re back to aliasing.

But all is not lost. Notice that the delta functions are modulated by the jitter
transform $|\gamma(\omega)|^2$. If we choose the distribution pattern carefully, then we can make
this function go to zero exactly where the impulses appear. If we use a rectangular
distribution for $p(u_k)$, then its Fourier transform will be a sinc function. By matching
the width of the box to the interval $2\pi\beta$, we can cancel out the impulses. Specifically,
if $p(u_k)$ is a box over the interval $[-1/(2\beta), 1/(2\beta)]$:

$$p(u_k) = \begin{cases} \beta & -1/(2\beta) < u_k < 1/(2\beta) \\ 0 & \text{otherwise} \end{cases} \tag{9.21}$$

then

$$\gamma(\omega) = \mathrm{sinc}(\omega/2\beta) \tag{9.22}$$

so the resulting sampling spectrum is

$$\Psi_s(\omega) = \beta\left[1 - \mathrm{sinc}^2(\omega/2\beta)\right] + 2\pi\beta^2\,\delta(\omega) \tag{9.23}$$

This PSD is shown in Figure 9.8. Note that the periodic impulses are exactly canceled
out.

FIGURE 9.7

Equation 9.18 plotted for (a) $d_0\alpha = 0.2$, (b) $d_0\alpha = 0.5$, and (c) $d_0\alpha = 0.9$.

Thus sampling with jitter distributed according to a $p(u_k)$ such as that in
Equation 9.21 gives us a sampling pattern that completely avoids aliasing! Because they
avoid aliasing, nonuniform sampling patterns have proven to be very useful in
practical rendering systems.
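The cancellation can be checked directly. In the sketch below, which is an
illustration under stated assumptions, the jitter distribution is a box of width
$1/\beta$ centered on each regular sample position, and sinc is the unnormalized form
$\sin(x)/x$. The code evaluates the jitter transform at the harmonics $k 2\pi\beta$,
where the impulses of a regular pattern would sit, evaluates the continuous noise term
$\beta[1 - \mathrm{sinc}^2(\omega/2\beta)]$, and generates a jittered train. All
function names are hypothetical.

```python
import math
import random


def sinc(x):
    """Unnormalized sinc: sin(x)/x, with the limiting value 1 at x = 0."""
    return 1.0 if x == 0.0 else math.sin(x) / x


def jitter_transform(omega, beta):
    """Fourier transform of a unit-area box of width 1/beta."""
    return sinc(omega / (2.0 * beta))


def jitter_noise_psd(omega, beta):
    """Continuous (noise) part of the jittered sampling spectrum."""
    return beta * (1.0 - jitter_transform(omega, beta) ** 2)


def jittered_train(count, beta, rng):
    """Regular positions k/beta, each displaced uniformly within its own cell."""
    spacing = 1.0 / beta
    return [k * spacing + rng.uniform(-0.5 * spacing, 0.5 * spacing)
            for k in range(count)]


beta = 2.0
for k in range(1, 4):
    # The transform vanishes (up to rounding) at every nonzero harmonic.
    print(k, jitter_transform(2.0 * math.pi * beta * k, beta))
print(jitter_noise_psd(0.0, beta), jitter_noise_psd(3.0 * math.pi * beta, beta))
print(jittered_train(5, beta, random.Random(5)))
```

The noise term is zero at the origin and rises toward $\beta$ at higher frequencies,
which is the behavior we want: the error is pushed away from the low frequencies.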








FIGURE 9.8

A plot of Equation 9.23. 


9.2.3 Sampling Pattern Comparison 

One way to compare the quality of different sampling patterns is by computing the 
power spectral distribution of the ensemble specified by a given process. Another 
approach was suggested by Chen and Allebach [81], who compare patterns based 
on the mean-squared error between the sampled signal and an original. Since the 
error depends both on the sample locations and the signal, if we want to compare 
the patterns alone, the dependence on the signal must somehow be factored out. 
They do this by considering the maximum of the mean-squared error over a class of 
signals. 

Chen and Allebach call each pattern a point set, and notate it as $X_i$. The general
approach is similar to what we have seen before; the reconstructed function is a
linear combination of the $N$ samples and a reconstruction filter $\phi$:

$$f(x) = \sum_{i=1}^{N} b_i\, \phi(x - x_i) \tag{9.24}$$


In matrix form, we can write

$$\mathbf{f} = \Phi \mathbf{b} \tag{9.25}$$

or, in tableau,

$$\begin{bmatrix} f(x_1) \\ f(x_2) \\ \vdots \\ f(x_N) \end{bmatrix} =
\begin{bmatrix}
\phi(x_1 - x_1) & \phi(x_2 - x_1) & \cdots & \phi(x_N - x_1) \\
\phi(x_1 - x_2) & \ddots & & \vdots \\
\vdots & & \ddots & \vdots \\
\phi(x_1 - x_N) & \cdots & \cdots & \phi(x_N - x_N)
\end{bmatrix}
\begin{bmatrix} b_1 \\ b_2 \\ \vdots \\ b_N \end{bmatrix} \tag{9.26}$$

Each point set has a corresponding point set matrix, $\Phi$. Each element of
matrix $\Phi$ is defined by

$$\Phi_{l,m} = \phi(|x_m - x_l|) \tag{9.27}$$

where $|x_m - x_l|$ is the distance between the two sample locations $m$ and $l$, and the
function $\phi$ is the reconstruction filter as in Equation 9.25. Note that the point set
matrix is in one-to-one correspondence with the point set.

For example, when $\phi$ is the 2D separable sinc function,

$$\Phi_{l,m} = \mathrm{sinc}[U(x_l - x_m)]\, \mathrm{sinc}[V(y_l - y_m)] \tag{9.28}$$

When $U$ and $V$ are orthogonal vectors, and the samples are uniformly spaced, then
Equation 9.25 coincides with the uniform reconstruction theorem.

Chen and Allebach proposed that the quality of the point set matrix (and thus the
point sets themselves) may be measured by the mean square average of the quality
of a signal sampled with that point set over many signals of the same energy. They
present a minimax argument for ranking two matrices [81]. Suppose we have a
function $\Lambda$ which accepts a matrix as an argument and returns the largest eigenvalue
in the matrix. Then matrix $\Phi_A$ is preferred over matrix $\Phi_B$ if $\Lambda(\Phi_A) < \Lambda(\Phi_B)$.
This argument may be extended to sort any number of matrices by preference.

They note that for large matrices, it may be prohibitively expensive to compute
the eigenvalues. They offer two other “goodness criteria” that approximate the
eigenvalue measurement. The first says that matrix $\Phi_A$ is preferred over matrix $\Phi_B$
if $\|\Phi_A\|^2 < \|\Phi_B\|^2$. Here, $\|\Phi_A\|$ is the Euclidean norm of the matrix:

$$\|\Phi\| = \left( \sum_{l=1}^{N} \sum_{m=1}^{N} \Phi_{l,m}^2 \right)^{1/2} \tag{9.29}$$

The other alternative measurement is that matrix $\Phi_A$ is preferred over matrix $\Phi_B$ if
$Q(\Phi_A) < Q(\Phi_B)$. Here the function $Q$ is given by

$$Q(\Phi) = \max_i Q_i, \qquad Q_i = \sum_{j} |\Phi_{i,j}| \tag{9.30}$$

Chen and Allebach found that the rankings produced by the two approximate
goodness measures were roughly similar to the more expensive eigenvalue measure.

One way to understand these measures is to think about how they classify some
well-understood matrices. For example, suppose matrix $\Phi_A$ is the identity matrix,
and matrix $\Phi_B$ is a matrix of all 1s. The identity matrix corresponds to completely
uniform samples spaced at multiples of the Nyquist interval, using the definition of
$\Phi$ in Equation 9.28. The matrix of all 1s corresponds to all the samples superimposed
on each other in one place (there is 0 distance between any two of them). In these
extreme examples the measures all prefer the uniform matrix A to the degenerate
single-point matrix B.
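The two extreme matrices make a convenient test of the three measures. The sketch
below is an illustration only: it computes the largest eigenvalue by power iteration
(a standard numerical method chosen here for self-containment, suitable for the
symmetric matrices in this example), the squared Euclidean norm as the sum of squared
entries, and the largest row sum of absolute values. The function names are
hypothetical.

```python
import math


def largest_eigenvalue(matrix, iterations=100):
    """Dominant eigenvalue of a symmetric matrix, estimated by power iteration."""
    n = len(matrix)
    v = [float(i + 1) for i in range(n)]          # arbitrary nonzero start vector
    for _ in range(iterations):
        w = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    # Rayleigh quotient of the (unit-length) converged vector.
    mv = [sum(matrix[r][c] * v[c] for c in range(n)) for r in range(n)]
    return sum(a * b for a, b in zip(v, mv))


def norm_squared(matrix):
    """Squared Euclidean norm: the sum of the squares of all entries."""
    return sum(x * x for row in matrix for x in row)


def max_row_sum(matrix):
    """Largest sum of absolute values along any row."""
    return max(sum(abs(x) for x in row) for row in matrix)


n = 4
identity = [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]
all_ones = [[1.0] * n for _ in range(n)]

for name, m in (("identity", identity), ("all ones", all_ones)):
    print(name, largest_eigenvalue(m), norm_squared(m), max_row_sum(m))
```

For a 4 × 4 example the identity matrix scores lower than the matrix of all 1s on
every measure, so all three rank the uniform pattern ahead of the degenerate one.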


9.3 Informed Sampling 

Now that we have freed our samples from an underlying grid, how might we
distribute them to our best advantage? There are two popular techniques for placing
samples to reduce the number needed for a good estimate of the signal. They are
both based on our knowing something about the characteristics of the signal before 
we begin. This may seem an unreasonable demand; after all, if we knew much about 
the signal before sampling, then we might not need to sample it at all. This is true, 
but we are usually in a situation between perfect ignorance and perfect wisdom, and 
have some general but incomplete knowledge of the signal that we can use to our 
advantage. 

The two methods are based on ideas very similar to the placement of sample points 
for Monte Carlo integration in Chapter 7, so we will use the same terminology as in 
that chapter. The first method is called stratified sampling. The idea here is to break 
up the domain into disjoint regions, and then place one sample into each region. This 
prevents our samples from all clumping together in one place and missing big pieces 
of the domain. The second method is called importance sampling, and basically
tries to direct our sampling process to take more samples in regions where the signal 
has a large value (and thus makes an important contribution to our estimate), while 
conserving samples where the signal is small (and therefore less important). 


9.4 Stratified Sampling 

The method of stratification is based on the realization that if we simply take samples 
of a signal at random places in a domain, we could be very unlucky and the first 
n samples might all land in the same region, causing them to clump together, as in 
Figure 9.9. 

If the samples are really placed “at random,” with uniformly distributed random
numbers, then eventually we expect them to cover the domain uniformly. The
problem is that we want a pretty uniform distribution right away, so that our first
few samples give us a good estimate of what’s happening everywhere in the domain.

FIGURE 9.9

(a) The first eight samples are all clumped together. (b) The first eight samples are
rather well distributed.

FIGURE 9.10

A domain broken up into strata.

To this end, we break the domain into strata, or regions that fill the domain
without gaps or overlaps, as in Figure 9.10. The strata need not all be the same size 
or shape. 

Mathematically, we are breaking up our signal into a sum of distinct, independent
signals. Suppose we have a 1D signal $f(t)$ over the interval [0,1], and we break up
the domain into four equal-sized intervals, as in Figure 9.11. Each of these intervals
then will receive one sample. We are effectively decomposing $f$ into four functions,
each defined over an interval of width 1/4:

$$f(t) = \sum_{i=0}^{3} f(t)\, b_{1/4}\!\left(\frac{2i+1}{8}\right) \tag{9.31}$$

where $b_{1/4}(c)$ is a box of width 1/4 centered at $c$.

FIGURE 9.11

A signal in the interval [0,1] broken into four equal strata.

FIGURE 9.12

A poor choice of strata can allow samples to clump together. (a) A stratification of a
square. (b) An unlucky sample placement that defeats this stratification.


When a domain has been stratified, our base (or starting) sampling will usually 
take one sample within each stratum. The samples can still clump together locally, 
as in Figure 9.9, but they can’t all land in roughly the same area unless the strata are 
poorly designed and allow this to happen. An example of a poor decimation of a 
2D domain is shown in Figure 9.12, where the strata form wedges that all meet at a 
common vertex. In this example all of our samples could still end up near the center. 
Thus it isn’t enough just to stratify the domain; the stratification must be designed 
to enforce a roughly uniform distribution of points. 

FIGURE 9.13 

(a) A common stratification for a square domain. (b) A common stratification for a circular domain. 
(c) An alternate circular stratification due to Shirley [402]. 


Stratification for square and circular domains may be based on rectangular and 
polar coordinate systems, as shown in Figure 9.13. This figure also shows a circular 
stratification due to Shirley [402]. 
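
As a concrete illustration of the rectangular case, the following minimal sketch places one uniformly distributed sample in each cell of a regular grid on the unit square. The function name `jittered_samples` and the choice of equal-sized rectangular strata are assumptions made for this example; the text allows strata of any size or shape.

```python
import random


def jittered_samples(nx, ny, rng=random):
    """Return one random sample per cell of an nx-by-ny grid on [0,1) x [0,1).

    Each grid cell is one stratum, so the samples cannot all land in the
    same part of the domain, although neighbors may still sit close together.
    """
    samples = []
    for j in range(ny):
        for i in range(nx):
            u = (i + rng.random()) / nx  # uniform within column i
            v = (j + rng.random()) / ny  # uniform within row j
            samples.append((u, v))
    return samples


if __name__ == "__main__":
    for point in jittered_samples(2, 2, random.Random(1)):
        print(point)
```

Passing a seeded `random.Random` instance makes a run repeatable, which is convenient when comparing sampling patterns.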

The big advantage of stratified sampling as presented so far over purely random 
sampling is that we are guaranteed that the samples are not all clumped together 
in one place. The big disadvantage is that we must decide, before sampling, the 
number of strata and their shapes. This can be a difficult decision, particularly when 
we want to start with a sparse sampling density and gradually increase it. 

The method of adaptive stratified sampling has been developed to ameliorate this 
problem. The method depends on an auxiliary data structure that is maintained 
during sampling, so we will defer a complete description until Chapter 10 when we 
survey practical sampling techniques. The basic idea is that we can start with an 
initial stratification of a domain, which may be as sparse as desired. If we want to 
take one more sample, then we can choose a stratum and split it, as in Figure 9.14. 

The sample we have already evaluated will fall into one of the two newly 
created strata, so we can generate a sample in the other stratum and evaluate it. The 
process may be repeated as many times as necessary to fulfill the refinement criteria. 
Important issues in this algorithm involve choosing which stratum to split, and how 
to split it. 
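
The splitting step can be sketched in 1D as follows. This is only an illustration under stated assumptions: strata are stored as a sorted list of (lo, hi, sample) tuples, the widest stratum is the one chosen for splitting, and it is split at its midpoint. These choices, and the name `split_widest`, are ours for the example and are not the refinement criteria of any particular published method.

```python
import random


def split_widest(strata, rng=random):
    """Split the widest stratum at its midpoint and add one new sample.

    strata is a list of (lo, hi, t) tuples that tile the domain, where t is
    the sample already taken in [lo, hi]. The existing sample stays in the
    half that contains it; a new sample is drawn in the other half.
    Returns the location of the new sample.
    """
    k = max(range(len(strata)), key=lambda i: strata[i][1] - strata[i][0])
    lo, hi, t = strata[k]
    mid = 0.5 * (lo + hi)
    if t < mid:
        new_t = mid + (hi - mid) * rng.random()  # old sample is in the left half
        strata[k:k + 1] = [(lo, mid, t), (mid, hi, new_t)]
    else:
        new_t = lo + (mid - lo) * rng.random()   # old sample is in the right half
        strata[k:k + 1] = [(lo, mid, new_t), (mid, hi, t)]
    return new_t


if __name__ == "__main__":
    rng = random.Random(0)
    strata = [(0.0, 1.0, rng.random())]  # sparse start: one stratum, one sample
    for _ in range(3):
        split_widest(strata, rng)
    print(strata)
```

Note that only one new signal evaluation is needed per split, since the sample already evaluated is reused.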
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New sample 


FIGURE 9.14 

Splitting a stratum. 


9.4.1 Importance Sampling 

Another way to make our samples count is to try to make sure that each one represents 
a useful quantity of information. We could define information in a formal sense as in 
information theory [19], but that would take us rather far afield, and there is instead 
an intuitive definition that works well. 

Consider the 1D signal shown in Figure 9.15(a). In the range [t_1, t_2] the value 
of the signal varies, but in general it is much smaller than the values in the interval 
[t_3, t_4]. 

Suppose the signal of Figure 9.15(a) represents energy (such as the light energy 
striking a point on an image plane or object surface), and we want to sample the 
signal with N samples. 

We assert that the best distribution of N samples would be one where each 
sample represents an equal amount of energy. If we have a plot of the signal, as in 
Figure 9.15(a), then we can break it up into a series of abutting rectangles, where 
the height of each rectangle is the value of the function at its center, and its width is 
such that the total area of each rectangle is the same, as in Figure 9.15(b). Placing 
the samples in such a way that each one represents the same amount of energy is 
called importance sampling . The term may be thought of as indicating that each 
sample has the same importance, or that when measured linearly along the t axis in 
the figure, there are more samples per unit area where the signal has a large value, 
and therefore there are more samples in “important” parts of the signal. 

It can legitimately be claimed that regions where the signal is 0 are every bit as 
important as where it is not. But since every sample is expensive, we want every 
sample to contribute as much as possible to our knowledge of the integral. Consider 
a signal over the domain [0,1] that is zero in the left half [0, .5] and a linear ramp 





FIGURE 9.15 

(a) A 1D signal. (b) The signal divided into N regions of equal energy. 

FIGURE 9.16 

A half-flat, half-ramp signal. 


from 0 to 1 in the right half [.5,1], as in Figure 9.16. The integral of this signal is 
0.25. Now suppose that although this is a trivial signal, it’s actually the result of some 
very expensive simulation and every sample comes at great cost. If we take a whole 
bunch of samples in the left half, they are all going to be 0 and will not contribute to 
the integral. It’s important to look at the left half, of course, but if we know that it’s 
0 (or even just small compared to the right half), then most of the contribution to 
the integral will come from the right half. We are better served by placing samples 
in the “important” regions when we want the best answers as quickly as possible. 

Note that in importance sampling the signal has been divided up into equal areas, 
not equal parametric intervals. 
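
The half-flat, half-ramp example can be tried numerically. The sketch below is a Monte Carlo variant of the idea rather than the deterministic equal-area placement described above: it compares uniformly distributed samples against samples drawn only from the right half, each weighted by the reciprocal of its sampling density. The function names, and the assumption that we already know the left half contributes nothing, are ours for illustration.

```python
import random


def signal(t):
    """Zero on [0, 0.5], then a linear ramp from 0 to 1 on [0.5, 1]."""
    return 0.0 if t < 0.5 else 2.0 * (t - 0.5)


def uniform_estimate(n, rng):
    """Estimate the integral over [0,1] with n uniformly placed samples."""
    return sum(signal(rng.random()) for _ in range(n)) / n


def importance_estimate(n, rng):
    """Estimate the same integral with all samples placed in [0.5, 1].

    The sampling density is p(t) = 2 on [0.5, 1] and 0 elsewhere, so each
    sample value is divided by 2 to keep the estimate unbiased.
    """
    total = 0.0
    for _ in range(n):
        t = 0.5 + 0.5 * rng.random()
        total += signal(t) / 2.0
    return total / n


if __name__ == "__main__":
    rng = random.Random(42)
    print("uniform:   ", uniform_estimate(1000, rng))
    print("importance:", importance_estimate(1000, rng))
```

Both estimates converge to the true value of 0.25, but the second wastes no evaluations on the flat half, so its variance for a given number of samples is lower.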

As with stratified sampling, this approach may seem difficult to implement 
because it requires knowing the function in advance before we can decide where the 
samples should be placed. This time it’s even harder to get information, though, 
because instead of dividing up the domain of the signal (which we usually have some 
knowledge of), we need to divide up the integral of the signal (which is just what 
we’re trying to find). 

If that’s all we knew about the signal, then importance sampling might not get us 
very far. But in computer graphics we almost never seek just the integral of a signal; 
there is almost always another function applied to that signal to modulate it. For 
example, on the image plane there’s the pixel-based reconstruction filter on top of 
each sample, and when the signal is an illumination signal at a point, the reflectance 
function of the object at that point is applied to the illumination. As these examples 
illustrate, the signal s(t) is usually modulated by some filter function f(t) to produce 
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a composite signal s(t)f(t). And in general, we know (or can get access to) f(t) even 
when s(t) is completely mysterious. 

The most common way to implement importance sampling in computer graphics 
is to use the filter function f(t) as an initial guess to the composite s(t)f(t). That is, 
we use the filter function to control the sample density over the domain. The intuition 
is that if the signal is bounded over the support of the filter, then the product of the 
signal and filter will lie below the filter curve scaled by some constant. In other 
words, the hope is that the filtered signal will have about the same shape as the filter. 
Figure 9.17 illustrates this idea. 

This approach is very attractive for both image-plane sampling and 
illumination-sphere sampling. In both cases we know the filter function, and we can preprocess 
it into a sum table [111]. Using a sum table, we can quickly find the total volume 
under any rectangular region of the filter’s domain. 

To illustrate its use, suppose the filter f[n] is defined in 1D (or is one component 
of a separable 2D filter). Then we precompute the sum table F[n]: 

    F[n] = \sum_{i=1}^{n} f[i]        (9.32)

Suppose the filter table has N entries. To divide the filter into two regions of 
equal energy, we find the point n_2 where F[n_2] = F[N]/2. Similarly, to divide any 
region [a, b] into two equal-energy regions, we split at the point n_a where 
F[n_a] - F[a] = (F[b] - F[a])/2, that is, where F[n_a] = (F[a] + F[b])/2. 
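
A minimal sketch of the sum table and the equal-energy split follows. It assumes a nonnegative filter, so that the table is nondecreasing and can be searched by bisection; a filter with negative lobes would need a different search. The names `build_sum_table` and `split_equal_energy` are hypothetical, and since a discrete table rarely contains the exact halfway value, the sketch returns the first entry that reaches it.

```python
from bisect import bisect_left
from itertools import accumulate


def build_sum_table(f):
    """Return F with F[0] = 0 and F[n] = f[1] + ... + f[n].

    The Python list f holds the filter entries f[1], ..., f[N] in order.
    """
    return [0.0] + list(accumulate(f))


def split_equal_energy(F, a, b):
    """Return the smallest n in (a, b] with F[n] >= (F[a] + F[b]) / 2.

    F[n] - F[a] is the energy in entries a+1 .. n, so this n is the first
    point at which at least half the energy of the region has been reached.
    """
    target = 0.5 * (F[a] + F[b])
    return bisect_left(F, target, a + 1, b)


if __name__ == "__main__":
    F = build_sum_table([1, 2, 3, 2, 1])
    middle = split_equal_energy(F, 0, len(F) - 1)
    print(F, middle)
```

Applying `split_equal_energy` recursively to the two halves produces four regions of roughly equal energy, then eight, and so on.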

Although this model of using the filter to approximate the signal for the purpose 
of importance sampling can work well, the model can break down, as Figure 9.18 
shows. If the signal is too large in some areas, then the filter must be scaled so much 
that the difference between the scaled filter’s minimum and maximum values is very 
small compared to the filtered signal’s average value. 

It is not unusual for our signals to exhibit this kind of behavior. Consider when 
we want to estimate the illumination function falling on some point on an object’s 
surface in space. In theory, even a small object far away can reflect a tremendous 
amount of light toward the point receiving the illumination; think of the sun glinting 
off of an automobile’s outside rear-view mirror on a sunny day. In general, we cannot 
predict from where bright light will arrive; after all, that is the problem we’re trying 
to solve when estimating illumination. 


9.4.3 Importance and Stratified Sampling 

Stratified sampling and importance sampling can be combined to make a technique 
more powerful than either one alone. The basic idea is to build strata that represent 
equal-energy portions of the signal. 

FIGURE 9.17 

(a) A signal s(t). (b) A filter f(t). (c) The product s(t)f(t) lies under the scaled filter Cf(t). 

FIGURE 9.18 

(a) A signal s(t). (b) A filter f(t). (c) The product s(t)f(t) does not lie under the scaled filter 
Cf(t) for a useful value of C. 


Generating points that conform to different filter functions can be challenging. 
Shirley [400] summarizes how to sample differently shaped domains with uniformly 
distributed random variables. Some of the most useful transformations are given in 
Table 9.1. The complete list may be found in [400], along with a description of the 
technique for generating additional transformations. 

Target space           Density                               Domain            Transformation
---------------------  ------------------------------------  ----------------  ----------------------------------
Disk of radius R       p = 1/(πR²)                           θ ∈ [0, 2π]       θ = 2πu
                                                             r ∈ [0, R]        r = R√v

Triangle with          p(a) = 1/area                         s ∈ [0, 1]        s = 1 − √(1 − u)
vertices a0, a1, a2                                          t ∈ [0, 1 − s]    t = (1 − s)v
                                                                               a = a0 + s(a1 − a0) + t(a2 − a0)

Surface of             p(θ, φ) = 1/(4π)                      θ ∈ [0, π]        θ = cos⁻¹(1 − 2u)
unit sphere                                                  φ ∈ [0, 2π]       φ = 2πv

Sector on surface      p(θ, φ) =                             θ ∈ [θ1, θ2]      θ = cos⁻¹(g)
of unit sphere         1/((φ2 − φ1)[cos(θ1) − cos(θ2)])      φ ∈ [φ1, φ2]      g = cos(θ1) + u(cos(θ2) − cos(θ1))
                                                                               φ = φ1 + v(φ2 − φ1)

TABLE 9.1 

Transformations from the uniform unit random variables (u, v) to various domains. Source: Data 
from Shirley in Graphics Gems III, table 1, p. 81. 
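
The transformations of Table 9.1 translate almost directly into code. The sketch below is one possible transcription; the function names, the Cartesian output for the disk and sphere, and the convention that θ is measured from the +z axis are assumptions made for this example.

```python
import math


def sample_disk(u, v, R=1.0):
    """Map (u, v) in [0,1]^2 to a point (x, y) on a disk of radius R."""
    theta = 2.0 * math.pi * u
    r = R * math.sqrt(v)
    return (r * math.cos(theta), r * math.sin(theta))


def sample_triangle(u, v, a0, a1, a2):
    """Map (u, v) to a point in the triangle with vertices a0, a1, a2."""
    s = 1.0 - math.sqrt(1.0 - u)
    t = (1.0 - s) * v
    return tuple(p0 + s * (p1 - p0) + t * (p2 - p0)
                 for p0, p1, p2 in zip(a0, a1, a2))


def sample_sphere(u, v):
    """Map (u, v) to a point (x, y, z) on the surface of the unit sphere."""
    theta = math.acos(1.0 - 2.0 * u)  # polar angle, measured from +z
    phi = 2.0 * math.pi * v           # azimuth
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))


def sample_sphere_sector(u, v, theta1, theta2, phi1, phi2):
    """Map (u, v) to angles (theta, phi) within a sector of the unit sphere."""
    g = math.cos(theta1) + u * (math.cos(theta2) - math.cos(theta1))
    theta = math.acos(g)
    phi = phi1 + v * (phi2 - phi1)
    return (theta, phi)


if __name__ == "__main__":
    print(sample_disk(0.25, 0.25, 2.0))
    print(sample_triangle(0.75, 0.5, (0.0, 0.0), (1.0, 0.0), (0.0, 1.0)))
    print(sample_sphere(0.5, 0.0))
```

Feeding these functions stratified (u, v) pairs rather than independent random ones carries the stratification of the unit square over to the target domain.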


9.5 Interlude: The Duality of Aliasing and Noise 

There is an interesting duality between aliasing and noise in an undersampled signal: 
beyond a certain point, reducing either of these comes at the expense of the other. To 
see this, we will take a detour from our main development and discuss uncertainty, 
which will lead to the noise-aliasing duality. This section is not essential to the rest 
of the book and may be skipped without harm. 

We will begin by deriving a famous principle of physics known as the Heisenberg 
uncertainty principle. It states that any pair of physical quantities coupled in a 
particular way demonstrate a reciprocal relationship with regard to the precision of 
their measurement. In other words, past some point of accuracy, the better we know 
one value, the less certain we can be of the other. 

One example is given by the position and momentum of a particle. We can 
measure both position and momentum with increasing precision until some fundamental 
limit is reached. From there on, the uncertainty principle takes over. If we refine our 
measurement of the particle’s position, we must necessarily become less sure of its 
momentum, and vice versa. This has nothing to do with the quality of our apparatus 
or the care we take when measuring; the product of the uncertainties is inherent in 
the structure of our universe. Another example is given by the time and energy of 
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an event in spacetime; these observations are related by the Fourier transform, and 
thus they are coupled in the same way as position and momentum. If we imagine 
that in a rendering system we have little bundles of energy traveling around on the 
backs of photons, carrying some given amount of energy from one place to another, 
then at some fundamental limit we will be unable to specify when those photons 
depart from and arrive at their source and destination. This value is far smaller than 
anything normally used in practice in computer graphics, but it does exist. 

We can derive the uncertainty principle by a purely theoretical development using 
the tools of the Fourier transform. We start with a few identities and definitions that 
will make the derivation flow more smoothly, following the development in Bracewell 
[61]. In this section, f refers to f(t), and f' refers to its first derivative, f' = df(t)/dt. 
Similarly, F refers to F(ω) = ℱ{f(t)}, and F' is defined as F' = dF(ω)/dω. All 
integrals in this section have infinite limits. 

We start by combining the differentiation property of Fourier transforms with 
Parseval’s theorem to find an identity for ⟨f' | f'⟩: 

    \langle f' | f' \rangle = \langle 2 j \pi \omega F | 2 j \pi \omega F \rangle 
                            = 4\pi^2 \langle \omega F | \omega F \rangle        (9.33)


We next define the centered variance of any function g(t) to be 

    \sigma_c(g)^2 = \frac{\langle tg | tg \rangle}{\langle g | g \rangle}        (9.34)


This definition is a simplification of the general definition of the variance that also 
takes into account the first two moments of the function and its centroid. The 
centroid of a function g(x) is defined as that value of x which, when multiplied by 
the total area of g, equals the first moment of g. In symbols, the centroid 
x_c is 

    x_c = \frac{\int x \, g(x) \, dx}{\int g(x) \, dx}        (9.35)

(For more details on the centroid and various other measures related to the Fourier 
transform, see [61, pp. 135-156].) 

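We note the following identity, which is the integration by parts over an infinite 
integral (assuming that t f² vanishes as t → ±∞): 

    \left| \int t f f' \, dt \right| = \frac{1}{2} \int f f \, dt 

    \left| \langle tf | f' \rangle \right| = \frac{1}{2} \langle f | f \rangle        (9.36)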
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Last, we will use a particular form of the Cauchy-Schwarz inequality. We will 
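derive this inequality by supposing we have two functions f and g, and some real 
number ε. Then 

    0 < \langle f + \epsilon g | f + \epsilon g \rangle        (9.37)

unless f = −εg, in which case we need to use any ε' ≠ ε. We can expand out the 
braket to get a quadratic in ε: 

    0 < \int (f + \epsilon g) \, \overline{(f + \epsilon g)} \, dt 
      = \int \left( f\bar{f} + \epsilon (g\bar{f} + f\bar{g}) + \epsilon^2 g\bar{g} \right) dt 
      = \langle f | f \rangle + \epsilon \left( \langle f | g \rangle + \langle g | f \rangle \right) + \epsilon^2 \langle g | g \rangle        (9.38)

So we get a quadratic a + bε + cε². For this quantity to be greater than zero for 
every real ε, the quadratic can have no real roots, which requires b² − 4ac < 0. 
Allowing for the degenerate case noted above, where equality holds, we have 

    \left( \langle f | g \rangle + \langle g | f \rangle \right)^2 \le 4 \langle f | f \rangle \langle g | g \rangle        (9.39)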


We are now ready to derive the uncertainty relation. We assume that we have a 
signal f with Fourier transform F, both of which have their centroid at 0. Then by 
the definition of Equation 9.34, we write 

    (\Delta t)^2 = \frac{\langle tf | tf \rangle}{\langle f | f \rangle} 

    (\Delta\omega)^2 = \frac{\langle \omega F | \omega F \rangle}{\langle F | F \rangle}        (9.40)

These are the uncertainties in the two signals. The precision of our knowledge of 
any value f(t) is represented by Δt at that t; similarly, our precision of the value of 
F(ω) is given by Δω. We want to find a lower limit on their product, so we write 
the product and expand: 


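    (\Delta t)^2 (\Delta\omega)^2 
      = \frac{\langle tf | tf \rangle \langle \omega F | \omega F \rangle}{\langle f | f \rangle \langle F | F \rangle} 
      = \frac{\langle tf | tf \rangle \langle f' | f' \rangle}{4\pi^2 \langle f | f \rangle \langle F | F \rangle} 
      = \frac{\langle tf | tf \rangle \langle f' | f' \rangle}{4\pi^2 \langle f | f \rangle^2}        (9.41)

where we have used Equation 9.33 and then Parseval’s theorem. Continuing, we 
apply Equation 9.39, with tf and f' playing the roles of f and g: 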

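    (\Delta t)^2 (\Delta\omega)^2 
      \ge \frac{\frac{1}{4} \left( \langle tf | f' \rangle + \langle f' | tf \rangle \right)^2}{4\pi^2 \langle f | f \rangle^2} 
      = \frac{1}{16\pi^2 \langle f | f \rangle^2} \left( \int t f \bar{f}' \, dt + \int t \bar{f} f' \, dt \right)^2 
      = \frac{1}{16\pi^2 \langle f | f \rangle^2} \left( \int t \, \frac{d(f\bar{f})}{dt} \, dt \right)^2 
      = \frac{1}{16\pi^2 \langle f | f \rangle^2} \left( -\int f \bar{f} \, dt \right)^2        (9.42)

where we applied Equation 9.36. Simplifying this last expression, we find 

    (\Delta t)^2 (\Delta\omega)^2 \ge \frac{1}{16\pi^2} \, \frac{\langle f | f \rangle^2}{\langle f | f \rangle^2} = \frac{1}{16\pi^2}        (9.43)

By taking the square root of both sides, we find 

    (\Delta t)(\Delta\omega) \ge \frac{1}{4\pi}        (9.44)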


This is the unitless form of Heisenberg’s uncertainty relation. When we use it in 
an application, we need to apply the appropriate units (and any necessary scaling 
factors). It tells us that the product of the uncertainty in any two parameters related 
by a Fourier transform can be no less than 1/4π, no matter how good our equipment 
is or how carefully we measure. Notice that we made no physical assumptions in our 
derivation; this is a straightforward result of the inherent uncertainty in any system. 
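
Equation 9.44 can be checked numerically without ever computing a Fourier transform. The sketch below is an illustration under stated assumptions: it handles a real signal with its centroid at 0, approximates the integrals with Riemann sums, and obtains Δω from Equation 9.33 together with Parseval’s theorem, so that (Δω)² = ⟨f' | f'⟩ / (4π² ⟨f | f⟩). The function name `uncertainty_product` and the grid parameters are ours.

```python
import math


def uncertainty_product(f, half_width=10.0, dt=1e-3):
    """Approximate the product (delta t)(delta omega) for a real signal f.

    The signal is sampled on [-half_width, half_width] with spacing dt and
    is assumed to be negligible outside that interval.
    """
    n = int(round(2.0 * half_width / dt))
    ts = [-half_width + k * dt for k in range(n + 1)]
    fs = [f(t) for t in ts]

    ff = sum(v * v for v in fs) * dt                         # <f|f>
    tf_tf = sum((t * v) ** 2 for t, v in zip(ts, fs)) * dt   # <tf|tf>

    # Central differences approximate f' at the interior grid points.
    dfs = [(fs[k + 1] - fs[k - 1]) / (2.0 * dt) for k in range(1, n)]
    df_df = sum(d * d for d in dfs) * dt                     # <f'|f'>

    delta_t_sq = tf_tf / ff
    delta_w_sq = df_df / (4.0 * math.pi ** 2 * ff)
    return math.sqrt(delta_t_sq * delta_w_sq)


if __name__ == "__main__":
    print("bound 1/(4 pi):       ", 1.0 / (4.0 * math.pi))
    print("Gaussian:             ", uncertainty_product(lambda t: math.exp(-math.pi * t * t)))
    print("two-sided exponential:", uncertainty_product(lambda t: math.exp(-abs(t))))
```

A Gaussian should sit essentially on the bound, while other pulse shapes, such as the two-sided exponential, should give a strictly larger product.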

For example, consider the conjugate pair of measurements for time and energy. 
In 1900, Planck formulated a theory that energy is bundled up into small, indivisible 
packets called quanta , and that each quantum had an associated oscillation of a 
given frequency. When a system absorbs or emits energy, it does so in discrete steps, 
corresponding to a single quantum. Planck’s relation expresses the relationship 
between ΔE, the change in energy of the system, and Δω, the change in the vibrational 
state of the system, as 

    \Delta E = h \, \Delta\omega        (9.45)

where the constant h ≈ 6.62 × 10⁻²⁷ erg-second. The value of h is one of the 
fundamental constants of our universe, similar to the speed of light or the charge on 
an electron. It is one of the constants that describes the structure of the universe we 
know. 

If we solve for Δω and plug it into Equation 9.44, we find 

    \Delta t \, \Delta E \ge \frac{h}{4\pi}        (9.46)

which is precisely Heisenberg’s uncertainty principle for time (measured in seconds) 
and energy (measured in ergs). It tells us that for any physical phenomenon, we can 
theoretically determine the time of that phenomenon with as much accuracy as we 
desire, but beyond a certain limit our concurrent knowledge of the energy in that 
event will become worse and worse, so the product of the uncertainties never goes 
below h/4π. 

If we measure the location x of a particle, and its momentum p = mv (m stands 
for mass, v for velocity), then we write Ax for the uncertainty in position and Ap 
for the uncertainty in momentum, yielding 

    \Delta x \, \Delta p \ge \frac{h}{4\pi}        (9.47)

which is a fundamental limit on how well we can know both the location and 
momentum of a particle simultaneously. 

Many clever experiments have been devised and carried out to test the uncertainty 
principle in practice; it has not yet been disproven. 

As promised at the start of this section, aliasing and noise in an undersampled 
signal are related by the uncertainty in just the same way as the coupled terms 
discussed above. We will follow the analysis in Resnikoff [358]. 

Consider a continuous-time 1D signal f with finite power, but infinite energy. 
That is, its energy 

    E = \int f^2(t) \, dt        (9.48)

is not finite, but its power 

    P = \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} f^2(t) \, dt        (9.49)

is finite. For these signals, we will focus attention on the autocorrelation function 
(which we will write as C(t) in this section) and its Fourier transform, the power 
spectral density (PSD) P(ω). 

If a signal’s PSD is flat within some interval [ω0, ω1], it is called white noise in that 
interval. Suppose that the PSD of f is a constant over all frequencies; then we know 
that the autocorrelation function C(t) of f must be a scaled impulse, since C(t) and 
P(ω) are a Fourier pair. That is, 

    \lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} f(u + t) \, f(u) \, du = 0        (9.50)

for t ≠ 0. Equation 9.50 tells us that f is completely uncorrelated with itself for all 
possible shifts t, except for the trivial shift t = 0, when the signal is perfectly aligned 
with itself and is perfectly correlated. If a signal is completely uncorrelated with 
itself, then it has no short- or long-term periodicity or structure; in other words, it is 
random, or noisy. This observation gives us a measure for the amount of noisiness 
in a signal; the closer the signal’s power spectral density is to a constant, the more 
noisy it is; larger nonlinearities correspond to less noisy signals. 

Suppose we now have a sampled signal s[n]. We know that if s contains aliases, 
then these correspond to large-scale patterns, which means that we will find that the 
autocorrelation function will have very broad support. In other words, if there are 
large patterns, then large shifts of the signal will cause it to align with itself to a 
significant degree, and this will be reflected by a large value in the autocorrelation. 
The wider the support of the autocorrelation function, the narrower the support of 
the power spectral density. 

To summarize the above discussion, we have noted that when P(ω) is broad, 
the signal is noisy. When C(t) is broad, the signal has aliasing. Because the two 
functions are related by the Fourier transform, as one becomes more broad, the other 
becomes more narrow. 
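
The contrast between a noisy signal and a patterned one shows up directly in a discrete autocorrelation estimate. The sketch below is only an illustration: the normalization (dividing by the lag-0 value), the signal length, and the name `normalized_autocorrelation` are assumptions made for the example.

```python
import math
import random


def normalized_autocorrelation(x, lag):
    """Estimate the autocorrelation of the sequence x at a nonnegative lag.

    The result is scaled by the signal's energy so that lag 0 gives 1.
    """
    energy = sum(v * v for v in x)
    overlap = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
    return overlap / energy


if __name__ == "__main__":
    rng = random.Random(7)
    noise = [rng.gauss(0.0, 1.0) for _ in range(4000)]
    pattern = [math.sin(2.0 * math.pi * i / 50.0) for i in range(4000)]
    for lag in (0, 25, 50, 100):
        print(lag,
              normalized_autocorrelation(noise, lag),
              normalized_autocorrelation(pattern, lag))
```

For the noise sequence the estimate stays near zero at every nonzero lag, which is the narrow, impulse-like C(t) of a broad PSD. For the periodic pattern it returns close to 1 whenever the lag is a multiple of the period, which is the broad C(t) that accompanies large-scale structure.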

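Using the definitions from above, a measure of the support of the autocorrelation 
function is given by 

    (\Delta t)^2 = \frac{\langle tC | tC \rangle}{\langle C | C \rangle}        (9.51)

and the support of the power spectral density is given by 

    (\Delta\omega)^2 = \frac{\langle \omega P | \omega P \rangle}{\langle P | P \rangle}        (9.52)

Then following the same arguments as before, we find 

    \Delta t \, \Delta\omega \ge \frac{1}{4\pi}        (9.53)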

Equation 9.53 tells us that when we have a signal with frequencies above the 
Nyquist rate, we can force either the aliasing or the noise in our reconstructed signal 
to be as low as we want, but that below a certain limit, further reductions in aliasing 
will correspond to an increase in noise, and vice versa. We cannot reconstruct an 
undersampled signal and simultaneously reduce both aliasing and noise indefinitely. 
At some point, reducing the aliasing will increase the noise, and vice versa. This just 
says that the extra energy in the signal has to go somewhere, which means either 
structured or unstructured (aliasing or noise) artifacts. 

9.6 Nonuniform Reconstruction 

The theory of reconstruction from nonuniform samples is not as complete as the 
theory that covers uniform samples. 

The basics of the theory were laid down in Yen [497]. Since then there have 
been a variety of approaches which vary in speed, precision, and efficiency. A survey 
of reconstruction methods published through 1986 is given in Sections V and VI 
of Marvasti [283]; a thorough up-to-date summary is available in Feichtinger and 
Gröchenig [143]. One of the biggest difficulties in nonuniform reconstruction is 
that for a given bandlimited signal f(t) that has been sampled at a finite number of 
arbitrary sample points t_k, the reconstruction process does not necessarily produce 
a unique result [244]. This means that in general some extra information has to be 
introduced into the algorithm. This information is typically either provided by the 
system or assumed by the nature of the reconstruction method. 

Each algorithm for nonuniform reconstruction uses a different set of introduced 
information, making the results and descriptions somewhat different. 

Since there is no unified theory available for this field right now, we will take a 
pragmatic approach in this book and present several different reconstruction 
methods in survey form in the next chapter. 

9.7 Further Reading 

The basic theory behind nonuniform sampling was developed by Yen in a classic 
paper in 1956 [497]. This work was followed by Leneman [262-265]. Most of this 
theoretical work is quickly surveyed and put to use for rendering by Dippe and Wold 
[124] and by Cook [101]. One of the first discussions of jitter for time sampling was 
presented by Balakrishnan [27]. An extensive discussion of nonuniform sampling 
and reconstruction from a particular point of view is offered by Marvasti in his book 
[283]. 

Resnikoff offers a discussion of the uncertainty principle from an information- 
theory point of view in [358]. An alternative and highly readable derivation of this 
principle is offered by Hamming [185]. 


9.8 Exercises 

Exercise 9.1 

Find the true mean of the signal 3x^2 + 2x over the interval [0,4]. Implement the 
algorithms of Figure 9.4 with T = 0.01 and n = 10, and Figure 9.6 with T = 0.01, 
p = 10, n_1 = 20, n_2 = 30. Run your algorithms several times using different seed 
values for the random number generators. Provide and explain your results. 

Exercise 9.2 

Using the metric of Equation 9.30, rank the 8 x 8 uniform sampling matrices over 
the interval [0,1] for the following reconstruction filters. 

(a) sinc(x) 

(b) e^{-x^2} 

(c) |.5 - x| 

Exercise 9.3 

We said that sometimes the reconstruction filter is used as an estimate of the 
importance function for sampling. Suppose we have a signal f(x) = x^2 and a 
reconstruction filter h(x) = e^{-x} over the domain [0,3]. Write a Monte Carlo program to 
compute an integral using importance sampling. Find the value of the integral, and 
plot the error as a function of the number of samples for 1 to 100 samples using the 
following importance functions. 

(a) g(x) = 1 

(b) g(x) = e^{-x^2} 

(c) g(x) = 5e^{-x^2} 

Interpret your results; was using the filter a good choice? 




The surest proof of the authenticity of my 
invention, I believe will be given by describing 
the motives which led me to its development, 
and by explaining the acoustical and mechanical 
principles of which I made application; for he 
alone is capable of carrying out a rational work, 
who is able to give a complete account of the 
why and wherefore of every detail from its 
conception to its completion. 

Theobald Boehm 

(“The Flute and Flute Playing," 1922) 



SURVEY OF SAMPLING AND 
RECONSTRUCTION TECHNIQUES 


10.1 Introduction 

In this chapter we will survey the sampling and reconstruction schemes developed 
in recent years for computer graphics applications. The theory of perfect sampling 
and reconstruction of bandlimited signals presented in previous chapters is only a 
starting point for practical methods. 

The techniques described in this chapter are usually appropriate for any number 
of dimensions, though the emphasis will be on 2D signals. Most of these methods 
were originally presented in terms of sampling the image plane, so in the literature 
there is much discussion of “the image” rather than the signal, and “pixels” rather 
than new samples of the reconstructed signal. 

In this chapter I will refer to the 2D signal being evaluated as the signal (or 
function) f(u,v). This is meant to encompass any 2D distribution, such as an 
illumination sphere (where (u, v) refer to direction angles (θ, φ)) or the image plane 
(where (u, v) refer to a screen location (x, y)). The general plan will be to eventually 
build up a set of n samples s_n = f(p_n) at the locations given by the p_n. When all the 
samples have been evaluated, they are used to reconstruct a new function f̂(u,v), 
which is then typically resampled at a new set of resample locations p̂_n giving new 
resample values ŝ_n = f̂(p̂_n). This is summarized in Equation 10.1. 

    f(x,y) \to s_n = f(p_n) \to \hat{f}(x,y) \to \hat{s}_n = \hat{f}(\hat{p}_n)        (10.1)

The most important property of 3D image synthesis that we must keep in mind 
when rendering is that we cannot guarantee that our input signal is bandlimited. In 
fact, it usually will not be. Boundary edges, texture discontinuities, and 
noncontinuous shading functions will all introduce high frequencies into the signal. 

Early polygon-rendering systems were able to prefilter the input data so that it 
was bandlimited before sampling [109,325], and thus the sampling theory of the 
previous chapters could be pretty much implemented in a straightforward way. But 
modern databases now contain complex geometry, quickly varying surface textures 
and shading models, and even volumetric components such as smoke and fog. When 
these models are combined with sophisticated rendering techniques that model 
motion blur, depth of field, and the propagation of light by multiple objects, the task 
of prefiltering appears insurmountable. Thus the perfect sampling and 
reconstruction theorems cannot be used directly, though the ideas behind them will still prove 
useful. 

Because databases are so complex that analytic techniques (like prefiltering) are 
very difficult to apply, most modern rendering is done by sampling the signal with 
many point samples and ultimately deriving a new set of samples for further 
computation or display, as in Equation 10.1. Most of these systems are guided by the 
following principle: 

The Sampler’s Credo: Every sample is precious. 

Most rendering systems spend the bulk of their time evaluating samples, which 
can involve ray tracing, evaluating shading functions, and physical simulation such 
as calculating motion and deformations. These are very expensive operations, so we 
want to minimize the number of samples we evaluate and get the maximum benefit 
from each one. 

As rendering and modeling methods get more sophisticated, we seem to be on 
a trend that makes analytic solutions ever more remote and the cost of sample 
evaluation ever more expensive. A modern rendering system is as conservative as 
possible when it comes to taking more samples, always taking the fewest number 
possible to get the required quality of estimate of the signal. 

The techniques in this chapter are designed to try to find the minimum number of 
samples required to estimate a signal with some specified degree of certainty. Some 
published reports address just one step of Equation 10.1; others present a set of 
coordinated techniques that handle all three operations. 
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There are some general principles that are common to most of these published 
methods. First, the signal is sampled as though it had a Nyquist rate comparable 
to the frequency of the expected reconstruction samples. For example, if the signal 
is energy falling on the image plane, then the initial sampling density is typically 
on the order of one sample per pixel or per group of pixels. The values of these 
samples are then examined, and in regions where the signal appears to have a very 
high bandwidth, more samples are created and evaluated. This is called adaptive 
refinement , and may be repeated until either the samples are judged to represent 
a good local estimate of the signal, or some recursion limit is reached. Then the 
sample values are used to reconstruct a signal approximating the input, and this is 
resampled to derive new sample values. These new samples may be placed into pixel 
locations in a frame buffer or used as input to a shading algorithm. 


10.2 General Outline of Signal Estimation 

This section will describe the components of Equation 10.1 in a bit more detail. An 
expanded block diagram is shown in Figure 10.1. 


10.3 Initial Sampling Patterns 

The first step in evaluating a signal is to create an initial sampling pattern (also called 
the base pattern ), which specifies a set of sample locations. We call it the initial 
pattern because later steps in the sampling process may create additional patterns 
with new samples to increase the sampling density in some places. 

The density of the initial pattern is typically derived from the expected density 
of the resampling pattern. Each sample in the initial pattern is typically meant to 
serve as an initial estimate for one or more resamples. The sample pattern may only 
specify a pair of coordinates in the 2D domain of the signal being sampled, or it 
may have any number of associated parameters, such as time, or lens position when 
sampling the image plane. 

The expected frequency content of the signal also influences the initial pattern. If 
the signal is expected to contain significant high-frequency information, the density 
of the initial pattern may be increased. Estimating the global frequency content of 
the signal before any samples have been drawn at all is usually difficult and often 
impractical; many systems don’t even bother to try and always use the same initial 
sampling pattern without attempting to first characterize the signal at all. 

Typically the density of the initial sampling pattern is constant across the parameter 
space. Intuitively, the initial sampling pattern represents an attempt to get a 
broad picture of the signal, including places where the value and derivatives of the 
signal are high and low. This gives us a general overview of the signal and directs 
the next step. 

FIGURE 10.1 

A block diagram of signal estimation. 

To improve on the initial estimate, many techniques then enter into a loop called 
adaptive refinement. This step is motivated by the observation that in computer 
graphics many signals contain large areas where the value is constant or only slowly 
changing. In these regions there are no sharp edges or changes in value, so the 
signal is bandlimited with a very low upper frequency. This suggests that in this 
neighborhood we may get a good estimate of the signal with just a small number 
of samples, perhaps even as few as the ones in the initial estimate. On the other 
hand, some regions of the signal will be complex in some way and require a closer 
examination to determine what is happening. In these regions of the image there 
are many high frequencies, so we want to increase our sampling rate in order to cut 
down on aliasing. 

To determine where more samples are needed, typically some number of nearby 
samples are examined. These samples make up the refinement test geometry. These 
samples are then evaluated with respect to some refinement test , which typically 
estimates (implicitly or explicitly) the local bandwidth of the signal in this region. If 
the test suggests that the bandwidth in this area is higher than the current sampling 
rate, the algorithm will generate new samples at locations given by the new sampling 
geometry. This process of examination and increased sampling repeats until the 
criteria are satisfied. Typically the refinement criteria include a cutoff test to prevent 
runaway sampling in pathological cases, and impose an upper limit on sampling in 
general. 
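
The refinement loop can be sketched in one dimension. The following is only an illustration under stated assumptions: the refinement test geometry is a pair of adjacent samples, the refinement test is a simple contrast threshold, new samples are placed at midpoints, and the cutoff is a maximum recursion depth. The names refine, adaptive_sample, threshold, and max_depth are hypothetical and do not come from any particular system.

```python
def refine(f, x0, x1, threshold=0.1, depth=0, max_depth=6):
    """Return sorted sample positions on [x0, x1], denser where f changes."""
    # Refinement test (an assumption): contrast between the two end samples.
    if depth >= max_depth or abs(f(x1) - f(x0)) <= threshold:
        return [x0, x1]
    mid = 0.5 * (x0 + x1)  # new sampling geometry: the midpoint
    left = refine(f, x0, mid, threshold, depth + 1, max_depth)
    right = refine(f, mid, x1, threshold, depth + 1, max_depth)
    return left[:-1] + right  # drop the duplicated midpoint


def adaptive_sample(f, n_initial=4, threshold=0.1, max_depth=6):
    """Start from a uniform initial pattern, then refine between neighbors."""
    base = [i / n_initial for i in range(n_initial + 1)]
    samples = []
    for a, b in zip(base, base[1:]):
        segment = refine(f, a, b, threshold, 0, max_depth)
        samples.extend(segment if not samples else segment[1:])
    return samples


def step(x):
    """A test signal with a single sharp edge at x = 0.3."""
    return 0.0 if x < 0.3 else 1.0


positions = adaptive_sample(step)
```

In this sketch the extra samples cluster around the edge at 0.3, while the constant regions keep only their initial samples; the depth limit plays the role of the cutoff test.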

When the signal is sampled to some degree of confidence or quality, we typically 
want to reconstruct it. We may wish to filter the signal if necessary before it is 
resampled to yield a new set of sample values. Typical uses for the resamples include 
evaluation in a reflection function or storage as pixel values in a frame buffer. 

We will now look at the various approaches to each of these steps. 


10.4 Uniform and Nonuniform Sampling 

There are two types of sampling patterns: uniform (also called regular or periodic) 
and nonuniform. Some nonuniform patterns are random (or stochastic) and are 
generated on demand by algorithms that use random numbers as part of the pattern- 
generation process. In general, early algorithms tended to be uniform, while more 
recent techniques are nonuniform. 

The attraction of uniform sampling is that if we assume our signal is bandlimited 
(that is, F(ω) = 0 for all ω > ω_F), then we can apply the uniform sampling and
reconstruction theory of previous chapters to guide our sampling process. This 
assumption is unlikely to be true in general, because there are many high-frequency 
components in graphics signals, as discussed earlier. But if we assume that the 
energy in most signals used in graphics decreases with increasing frequency, then as
we increase the sampling rate we decrease the aliasing error. 

One of the best ways to get an intuition for the effects of aliasing is to consider 
what happens to an image when it is undersampled. When an image is sampled with
a regular pattern whose density is too low, a variety of aliasing structures are visible in the
result. We use the term “structure” because the artifacts due to aliasing often take 
the form of clearly visible patterns. 

For example, consider Figure 10.2, based on an example from Mitchell and 
Netravali [310]. Here a regular square grid has been used for the geometry of the 
sample points. Notice that as the frequency goes up, the quality of the sampling 
decreases. On the right side of the figure is a very clear set of concentric rings. These 
rings are the result of sampling error, or aliasing.
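
The same effect can be seen in one dimension with a few lines of code. This is a minimal sketch, assuming a 10 Hz uniform sampling rate and two sine waves chosen for illustration; the function name sample_sine is hypothetical.

```python
import math


def sample_sine(freq_hz, rate_hz, count):
    """Point-sample sin(2*pi*f*t) at t = n / rate for n = 0 .. count - 1."""
    return [math.sin(2.0 * math.pi * freq_hz * n / rate_hz) for n in range(count)]


rate = 10.0                       # samples per second (Nyquist limit is 5 Hz)
high = sample_sine(9.0, rate, 20)  # well above the Nyquist limit
low = sample_sine(1.0, rate, 20)   # well below it

# On this grid the 9 Hz samples are the negated 1 Hz samples, so the
# high frequency is indistinguishable from a phase-flipped low one.
max_diff = max(abs(h + l) for h, l in zip(high, low))
```

Because the two sets of samples agree up to sign, no reconstruction filter can tell the two signals apart after sampling; the 9 Hz signal has aliased to 1 Hz.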

Another example, from Cook [101], is shown in Figure 10.3. The image function 
here is a row of narrow white triangles on a black background. The figure shows the 
structure of the triangles and the result of sampling with one sample in the center of 
each pixel. The resulting image is badly aliased, but the error is so strongly structured 
that it appears to be a perfectly reasonable pattern. Even if the sampling density is 
increased to four uniform samples per pixel, we still suffer badly from aliasing, yet 
the final image contains obvious structure. 

Error is inevitable when we try to evaluate any signal by point sampling. The 
problem is that our functions are typically not bandlimited, so the sampling theory 
guarantees us that we can never take enough samples to capture the complete signal. 
From a practical standpoint, we can never be sure that we have sampled all the fine 
structure in a signal, even inadequately. Fine detail in graphics signals can come from 
many sources, including geometry, shading, texturing, and motion. It is unclear that 
we will ever have techniques guaranteed to give us accurate and useful upper limits 
on the local bandwidth, and without them we must resort to guessing the appropriate 
sample rate. Whenever we guess too low, the result will be an incorrect estimate of 
the function. Whenever we guess too high, we waste time. 

When we sample with a regular pattern, structure in the signal combines with 
structure in the pattern to create new structure in the samples; this is the source of 
the structures we saw in the figures above. This is visible in everyday life: when 
two mesh screens are placed over one another, a set of rich moiré patterns emerges,
particularly when one screen is moved over the other. 

The heart of the problem is that the two patterns, one inherent in the
signal and one created by the sampling geometry, are combining with each other to
create new patterns. As we mentioned above, these patterns are often visible in 
images. Without an upper limit on the local bandwidth of the signal, we cannot 
avoid aliasing; and if the sampling geometry is regular, we always run the risk 
of introducing new, extraneous structures into our reconstructed signals. If extra 
structures are appearing because of interference between two patterns, it may seem 
possible to eliminate the structures by changing one of the patterns enough so that 
there cannot be any regular interference. We can’t change the signal (and probably
wouldn’t want to if we could), but we can change the sampling pattern.

FIGURE 10.2

(a) A series of concentric rings sampled on a dense uniform grid. (b) The signal sampled on a
uniform 128 x 128 grid.

FIGURE 10.3

(a) The geometry of a test pattern. (b) Aliasing caused by regular sampling. (c) One sample per
pixel. (d) Two samples per pixel. (e) Three samples per pixel. (f) Four samples per pixel. Based on
Cook in ACM Transactions on Graphics, fig. 12a-f, p. 66.

This observation is the driving principle behind the variety of nonuniform sampling
techniques described below. As we saw in Chapter 9, nonuniform sampling
tends to introduce uncorrelated noise into a signal. If our sampling rate is too low, 
our reconstructed signal will still contain errors, but they will be of a noisy form that 
is less objectionable to the visual system than the structural errors that result from 
uniform sampling. 


10.5 Initial Sampling 

There are two principal approaches to creating an initial sampling geometry: uniform
and nonuniform. Most methods fit into one of these categories; some are hybrids. We
will take these in turn. We will focus our discussion on generating samples of a 2D
signal; some generalizations to higher dimensions will be discussed near the end of
the chapter.

In this chapter we will illustrate the creation of some patterns with short
pseudocode algorithms, inspired by Shirley [397]. We posit a function
randomInterval(a, b) that produces a uniformly distributed random number
on the interval [a, b]. Three common intervals have their own shorthand, summarized
in Equation 10.2.


    unit() = randomInterval(0, 1)
    symmetric(a) = randomInterval(-a, a)
    range(a) = randomInterval(0, a)                                    (10.2)

The function randomInteger(a, b), where a, b ∈ Z, returns a uniformly distributed
random integer in the range [a, b]. We also define a Boolean function
flip() that returns true or false with equal probability.
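
For readers who want to run the patterns in this chapter, these helpers might be written as follows. This is a minimal sketch built on Python's standard random module; the snake_case names are our own, and range_ carries a trailing underscore only to avoid shadowing Python's built-in range.

```python
import random


def random_interval(a, b):
    """Uniformly distributed float on the interval [a, b]."""
    return random.uniform(a, b)


def unit():
    return random_interval(0.0, 1.0)


def symmetric(a):
    return random_interval(-a, a)


def range_(a):
    return random_interval(0.0, a)


def random_integer(a, b):
    """Uniformly distributed integer in [a, b], both ends included."""
    return random.randint(a, b)


def flip():
    """True or False with equal probability."""
    return random.random() < 0.5
```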


10.5.1 Uniform Sampling 

Uniform sampling patterns may be described with respect to a lattice . A 2D lattice 
is a set of points generated by combining two basis vectors in all possible ways. 
Figure 10.4 shows two examples. The most common uniform sampling pattern in 
computer graphics is the rectangular lattice. 

FIGURE 10.4

(a) A rectangular lattice and its basis vectors. (b) A triangular lattice and its basis vectors.

FIGURE 10.5

Supersampling cells. Each contains 2 x 3 pixels.

for r ← 0 to h - 1
    for c ← 0 to w - 1
        k ← (rw) + c
        x_k ← (1 + 2c)/(2w)
        y_k ← (1 + 2r)/(2h)
    endfor
endfor

FIGURE 10.6

Regular sampling aligned to resample grid.


10.5.2 Rectangular Lattice

The rectangular (or square) lattice is created by two perpendicular vectors, as shown 
in Figure 10.5. 

This pattern is used extensively for sampling the image plane. This is probably 
because this pattern matches the resample locations, which are pixel centers in a 
frame buffer. When used for image sampling, it is common to think of the frame 
buffer as grouped into rectangular cells, each one enclosing m x n pixels [6,64,196,
228,361,436,477,491]. Sometimes the cells are 1 x 1, so the lattice geometry and 
the pixels coincide. 

The locations of a regular lattice on a grid of resolution w x h may be found from 
the code in Figure 10.6. Here we have used the convention that places the center of 
pixel (x, y) at (x + 0.5, y + 0.5) [208].

This sampling lattice may be displaced with respect to the pixel centers. If the 
lattice is the same size as the pixels but is translated by a half-pixel in both directions, 
the lattice points will fall on pixel corners, as in Figure 10.7 [477,491]. This causes 
the pattern to enlarge by one sample in each dimension. The code in 2D is shown in
Figure 10.8. 
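
Both regular patterns are easy to express directly. The sketch below assumes, as the pseudocode does, that the samples are normalized to the unit square and stored in row-major order (index k = rw + c for centers); the function names are hypothetical.

```python
def grid_centers(w, h):
    """Samples at the centers of a w x h grid of pixels, in the unit square."""
    return [((1 + 2 * c) / (2 * w), (1 + 2 * r) / (2 * h))
            for r in range(h) for c in range(w)]


def grid_corners(w, h):
    """Samples at pixel corners: one extra row and one extra column."""
    return [(c / w, r / h) for r in range(h + 1) for c in range(w + 1)]


centers = grid_centers(2, 3)   # 6 samples
corners = grid_corners(2, 3)   # 12 samples, including the border
```

The corner pattern has (w + 1)(h + 1) samples rather than wh, which is the enlargement by one sample per dimension mentioned above.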


10.5.3 Hexagonal Lattice

The hexagonal lattice is shown in Figure 10.9. This is called a hexagonal lattice 
because each sample has six nearest neighbors. This sampling pattern has occasionally
been used in computer graphics [124], but it is common in image processing
[130,345].

The code for a hexagonal lattice corresponding to Figure 10.9 is given in Figure 10.10.
Here we assume that the hexagon has a height of 2 units, so each edge is
of length 2/√3.






FIGURE 10.7

(a) The lattice points fall on pixel centers. (b) The lattice points fall on pixel corners.

for r ← 0 to h
    for c ← 0 to w
        k ← (r(w + 1)) + c
        x_k ← c/w
        y_k ← r/h
    endfor
endfor

FIGURE 10.8

Regular sampling displaced one-half pixel to resample grid.


The hexagonal lattice has several attractive qualities. The density of this lattice 
is higher than the square lattice, so there are more samples from this lattice in any 
given area. If we are sampling a signal whose spectrum lies within a circle, the 
hexagonal lattice requires 13.4% fewer samples to represent the signal accurately 
than a rectangular lattice [130]. 
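
The 13.4% figure can be checked with a short calculation. This is a sketch under two assumptions: the spectrum is confined to a disk of radius W (a symbol introduced here only for illustration), and the number of samples per unit area equals the area of one cell of the lattice on which the spectral replicas sit, with replicas spaced 2W apart so that they just touch.

```latex
\begin{align*}
\rho_{\text{square}} &= (2W)^2 = 4W^2, \\
\rho_{\text{hex}} &= \tfrac{\sqrt{3}}{2}\,(2W)^2 = 2\sqrt{3}\,W^2, \\
\frac{\rho_{\text{hex}}}{\rho_{\text{square}}} &= \frac{\sqrt{3}}{2} \approx 0.866 .
\end{align*}
```

Under these assumptions the hexagonal arrangement needs about 1 - 0.866 = 13.4% fewer samples per unit area.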

In fact, the hexagonal lattice is the densest possible lattice in 2D. To see this, 
consider a signal with a circular Fourier transform. How densely can we fill the
plane with copies of this circular spectrum? The hexagonal arrangement is the
densest packing of circles in the plane.

for r ← 0 to h - 1
    for c ← 0 to w - 1
        k ← (rw) + c
        x_k ← (3c)/√3
        y_k ← 2r + (c mod 2)
    endfor
endfor

FIGURE 10.10

Hexagonal (or triangular) sampling.
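
A quick way to convince ourselves that a set of points really is hexagonal is to count nearest neighbors. The sketch below assumes hexagons of height 2 whose columns are spaced 3/√3 apart, with odd columns shifted up by half a row (one unit); the function names are hypothetical.

```python
import math


def hex_lattice(w, h):
    """Centers of w columns by h rows of hexagons of height 2, row-major."""
    return [(3 * c / math.sqrt(3), 2 * r + (c % 2))
            for r in range(h) for c in range(w)]


def neighbors_at(points, p, spacing=2.0, tol=1e-9):
    """Count the points lying at the given spacing from p."""
    return sum(1 for q in points
               if q != p and abs(math.dist(p, q) - spacing) < tol)


points = hex_lattice(5, 5)
interior = points[2 * 5 + 2]           # row 2, column 2
count = neighbors_at(points, interior)  # six neighbors at distance 2
```

With this layout every interior sample has exactly six neighbors at distance 2 and none closer, which is the property that gives the lattice its name.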

The hexagonal lattice is also more isotropic than the square lattice, so there is 
more uniform sampling in each direction. For example, the nearest neighbors to any 
sample in a square lattice may be found in only two directions; in the hexagonal 
lattice there are neighbors in three directions. 

The hexagonal lattice does have some drawbacks. One possible inconvenience 
from a programming point of view is that the sample points do not fall on integer 
coordinates. Another, somewhat more serious, problem is that the hexagon is not
a reptile. A reptile is a shape that can be decomposed into smaller copies of itself 
without overlapping or gaps, as we saw in Figure 6.5. Reptiles are easily used for 
adaptive supersampling; nonreptiles are harder. 


10.5.4 Triangular Lattice

Related to the hexagonal lattice is the isosceles triangular lattice, shown in Figure 10.11.

Although it has been used in graphics [405], this lattice is rarely used. One 
inconvenience is that sample points do not fall on integer locations, a trait shared 
with the hexagonal lattice. Another problem is that we must keep track of the sense 
of each triangle (upward- or downward-pointing). 


10.5.5 Diamond Lattice

The diamond (or quincunx) lattice is shown in Figure 10.12. This pattern has been
used only for image-plane sampling [59]; Figure 10.13 shows it with
respect to a set of pixels.

This pattern is only recognizable as a diamond when compared to a rectangular 
resample pattern; otherwise it is simply a rotated square lattice. The diamond lattice 
is interesting because it matches the directional sensitivity of the human eye, as 
discussed in Chapter 1. It is denser in directions in which the eye is sensitive. The 
initial sampling pattern with this lattice may also be defined by cells that cover many 
pixels [59]. 


10.5.6 Comparison of Subdivided Hexagonal and Square Lattices

It is interesting to observe how the number and density of samples varies for square 
and hexagonal lattices as they are subdivided. Figure 10.14 shows how the two types 
of lattices may be subdivided to make similar lattices of a higher density. Each level
of subdivision is called a generation; the first level is called generation 0, the next
generation 1, and so on. We tile the hexagonal lattice with triangles for uniformity.

FIGURE 10.11

A triangular lattice.

FIGURE 10.12

A diamond lattice.

FIGURE 10.13

A diamond pattern over a set of pixels.

We will use g to represent the generation number, and s_g to represent the number
of samples involved in generation g. The value d_g is the number of new samples
taken to move from generation g - 1 to g, so d_g = s_g - s_{g-1}. Example values for
different generations are shown in Table 10.1, where we use the notation

\[
a^{+} = \sum_{i=1}^{a} i = 1 + 2 + \cdots + a = (a/2)(a + 1)
\tag{10.3}
\]

The incremental number of samples in each generation for the square grid is given by

\[
\begin{aligned}
d_{\square}(g) &= (1 + 2^{g})^{2} - (1 + 2^{g-1})^{2} \\
               &= 2^{g-1}\left[(1 + 2^{g-1}) + (1 + 2^{g})\right]
\end{aligned}
\tag{10.4}
\]

and for the triangular grid by

\[
\begin{aligned}
d_{\triangle}(g) &= (1 + 2^{g})^{+} - (1 + 2^{g-1})^{+} \\
                 &= 3\,(2^{g-1})^{+}
\end{aligned}
\tag{10.5}
\]
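
These counts are easy to check numerically. The following sketch assumes the totals in Table 10.1, namely (1 + 2^g)^2 samples for the square lattice and (1 + 2^g)^+ for the triangular one, and 4^g tiles in generation g; the function names are our own.

```python
def tri(a):
    """The a^+ notation of Equation 10.3: 1 + 2 + ... + a."""
    return a * (a + 1) // 2


def square_total(g):
    return (1 + 2 ** g) ** 2


def hex_total(g):
    return tri(1 + 2 ** g)


def square_new(g):
    """Closed form of Equation 10.4, valid for g >= 1."""
    return 2 ** (g - 1) * ((1 + 2 ** (g - 1)) + (1 + 2 ** g))


def hex_new(g):
    """Closed form of Equation 10.5, valid for g >= 1."""
    return 3 * tri(2 ** (g - 1))


for g in range(1, 5):
    tiles = 4 ** g
    print(g, tiles, square_new(g), hex_new(g), square_total(g), hex_total(g),
          round(square_total(g) / tiles, 2), round(hex_total(g) / tiles, 2))
```

Each closed form agrees with the difference between consecutive totals, so the new-sample columns and the total-sample columns of the table are consistent with each other.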

FIGURE 10.14

(a) A square lattice. (b) The square lattice subdivided. (c) A hexagonal lattice. (d) The hexagonal
lattice subdivided.


              Total    New samples        Total samples          Samples per tile
Generation    tiles    Square    Hex      Square       Hex       Square           Hex

0             1        4         3        4            3         4.00             3.00
1             4        5         3        9            6         2.25             1.50
2             16       16        9        25           15        1.56             0.94
3             64       56        30       81           45        1.27             0.70
4             256      208       108      289          153       1.13             0.60
g             4^g      d_□(g)    d_△(g)   (1 + 2^g)^2  (1 + 2^g)^+  (1 + 2^g)^2/4^g  (1 + 2^g)^+/4^g


TABLE 10.1

Lattice densities.

We can see that for any given level of subdivision, the triangular grid requires
fewer samples overall than the square grid and produces fewer samples per tile. This 
is advantageous in situations where we want a roughly uniform coverage of the 
domain with the least number of tiles. This advantage must be balanced against the 
different shapes of the grids; often a square lattice is just the right shape for a 2D 
signal in computer graphics. 


10.5.7 Nonuniform Sampling

In general, the phrase nonuniform sampling refers to any sampling technique that 
produces a sampling pattern that is not periodic. 

The sampling techniques discussed in Chapter 8 are all candidates for nonuniform
sampling of a signal. There are two primary types of nonuniform sampling:
patterned and random (or, more specifically, quasi-random ). 

Patterned nonuniform samples are used when a known nonuniform distribution 
of samples is desired. They are typically used to sample 2D signals with a known, 
but awkward, parameterization. For example, the surface of a sphere may be 
parameterized by a 2D function corresponding to spherical angles, but equally spaced 
samples in this parameter space do not correspond to equally spaced samples on the 
sphere. 

Because generating most types of nonuniform sample geometries is often a time- 
consuming process, some algorithms create the samples before rendering begins, 
and then save them in a file. The sample locations may then be retrieved and used 
immediately as needed. Because of this, the line between random and deterministic 
nonuniform patterns becomes blurred when rendering with these techniques. Other 
methods, which generate samples on the fly as needed, are closer to a random process. 


10.5.8 Poisson Sampling

The simplest random pattern consists of a series of samples that have no relationship
to each other and none between their coordinate values. To generate random samples
in R^n, we simply pick n uniformly distributed random numbers and use them as the
sample location. This is called Poisson (or random) sampling, and is illustrated by
the pseudocode in Figure 10.15.


10.5.9 N-Rooks Sampling

A technique called N-rooks sampling is useful for sampling a signal that has been 
stratified in an N x N grid [397,402]. An example is shown in Figure 10.16. 

for i ← 0 to N - 1
    x_i ← unit()
    y_i ← unit()
endfor

FIGURE 10.15

Poisson (or random) sampling.

FIGURE 10.16

An N-rooks sampling pattern.


In this pattern, one sample is taken in each row and each column. The name is
inspired by the chess piece called the rook, which can only move along the rows and
columns of the board. If the grid is thought of as a chessboard and the samples are
rooks, then no rook can capture another (i.e., land on its square) in one move, since
no two samples are in the same row or column.

To make an N-rooks pattern, begin by placing samples along the main diagonal 
of the grid, and then randomly permute (or shuffle) the columns. The code for an 
N x N grid is shown in Figure 10.17. Here the function permute () shuffles its 
arguments into a new, random order. 

for i ← 0 to N - 1
    x_i ← i/(N - 1)
    y_i ← i/(N - 1)
endfor
permute(x)

FIGURE 10.17

N-rooks sampling code.
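
The diagonal-then-shuffle construction translates almost directly into code. This is a minimal sketch, assuming N is at least 2 and that the samples sit at the positions i/(N - 1); the function name n_rooks and the optional rng argument (any object with a shuffle method, such as random.Random) are our own choices.

```python
import random


def n_rooks(n, rng=random):
    """One sample in every row and every column of an n x n grid (n >= 2)."""
    coords = [i / (n - 1) for i in range(n)]  # samples on the main diagonal
    xs = coords[:]
    rng.shuffle(xs)                            # randomly permute the columns
    return list(zip(xs, coords))


pattern = n_rooks(8, random.Random(3))
```

Because only the x coordinates are permuted, each row still holds exactly one sample, and the permutation guarantees the same for each column.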


for r ← 0 to h - 1
    for c ← 0 to w - 1
        k ← (rw) + c
        x_k ← (1 + 2c)/(2w) + symmetric(1/(2w))
        y_k ← (1 + 2r)/(2h) + symmetric(1/(2h))
    endfor
endfor

FIGURE 10.18

Jittered regular sampling aligned to resample grid.


10.5.10 Jitter Distribution

A sampling pattern may be perturbed by the addition of jitter . Jitter, or random 
displacement, may be applied to any pattern based on stratified sampling by moving 
each sample to a random position within its piece of the domain. So it is easy to 
add jitter to any of the uniform patterns, or the N-rooks pattern, simply by adding 
an appropriate amount of random displacement to each sample (taking care that the 
displacement keeps the sample within its domain). 

The code in Figure 10.18 shows how to create a jittered regular square grid by 
adding noise to Figure 10.6. The noise is enough to move the sample as much as 
halfway toward either neighbor horizontally and vertically. 
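
In runnable form, a jittered square grid might look like the sketch below. It assumes the grid is normalized to the unit square and that the jitter is at most half the sample spacing in each direction, so every sample stays inside its own cell; the function name and the rng argument are hypothetical.

```python
import random


def jittered_grid(w, h, rng=random):
    """Cell centers of a w x h grid, each displaced by up to half a cell."""
    points = []
    for r in range(h):
        for c in range(w):
            x = (1 + 2 * c) / (2 * w) + rng.uniform(-1 / (2 * w), 1 / (2 * w))
            y = (1 + 2 * r) / (2 * h) + rng.uniform(-1 / (2 * h), 1 / (2 * h))
            points.append((x, y))
    return points


jittered = jittered_grid(4, 3, random.Random(11))
```

The samples are still stratified, one per cell, but their positions within the cells are no longer periodic.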

An example of this algorithm is shown in Figure 10.19, along with the magnitude 
of its Fourier transform. 

The hexagonal lattice may also be jittered, though the geometry is slightly more 
complex. Figure 10.20 shows a hexagon divided into twelve equivalent regions. 

We can jitter a sample in the hexagon’s center by generating a random displacement
in the region marked I, perhaps reflecting it about the y axis into the region
F, and then rotating it by some integer multiple of 60° = π/3. To pick a random
point within I, we first pick an angle θ ∈ [π/3, π/2] and then a distance d from
the center; a little trig shows that, for the hexagon of height 2 used above, the
maximum value for d as a function of θ is d_max = 1/sin θ, which runs from 2/√3 at
the corner (θ = π/3) to 1 at the middle of the top edge (θ = π/2). The pseudocode is
shown in Figure 10.21.

FIGURE 10.19

(a) A jittered rectangular lattice. (b) The magnitude of the Fourier transform of (a).


10.5.11 Poisson-Disk Pattern

A Poisson-disk distribution is a random pattern that satisfies the Poisson-disk criterion:
no two samples are closer together than some distance r_p. The name of this
pattern is inspired by the idea of surrounding each random sample by a disk of the
given radius, such that no two disks overlap. We also usually want the samples to 
be as close together as the disks allow. 


10.5.12 Precomputed Poisson-Disk Patterns

A theoretically proper way to create a Poisson-disk pattern is to generate a large 
number of random patterns with Poisson statistics and use only those that satisfy the 
Poisson-disk criterion [307]. 

This is not a very efficient way to generate such a pattern, so a variety of alternatives
have been devised. One popular approach is to build a small pattern
(sometimes called a prototype or tile) that satisfies the Poisson-disk criterion and
then save that pattern in a table. For convenience, we will assume that the tile has
unit parameterization; that is, it spans the domain (0,0) to (1,1) [101,124]. The
table may then be replicated with rotations and reflections to cover the sampling 
domain, as discussed below. 








FIGURE 10.20

A hexagon broken up into twelve equivalent regions. (a) The initial (I) and flipped (F) regions.
(b) Finding a point within I.

for r ← 0 to h - 1
    for c ← 0 to w - 1                          Scan all rows and columns.
        θ ← randomInterval(π/3, π/2)
        d ← range(1/sin θ)
        Δx ← d cos θ
        Δy ← d sin θ                            Pick a random point in the primary region.
        if flip() then
            Δx ← -Δx
        endif                                   Perhaps flip it into region F.
        θ ← (π/3) * randomInteger(0, 5)         Pick one of six sides to rotate into.
        Δx' ← Δx cos θ + Δy sin θ
        Δy' ← -Δx sin θ + Δy cos θ              Rotate the jitter vector.
        k ← (rw) + c
        x_k ← (3c)/√3 + Δx'
        y_k ← 2r + (c mod 2) + Δy'              Add the jitter into the hexagon center.
    endfor
endfor

FIGURE 10.21

Jittering a hexagonal lattice.
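
The pick, flip, and rotate recipe can be tested by checking that every displacement stays inside the hexagon. The sketch below is built on our own assumptions: a flat-topped hexagon of height 2 (so the distance from center to each edge is 1), for which the farthest allowed distance at angle θ in [π/3, π/2] works out to 1/sin θ. The function names are hypothetical.

```python
import math
import random


def hex_jitter(rng=random):
    """Random displacement inside a flat-topped hexagon of height 2."""
    theta = rng.uniform(math.pi / 3, math.pi / 2)   # angle within region I
    d = rng.uniform(0.0, 1.0 / math.sin(theta))     # distance, up to the top edge
    dx, dy = d * math.cos(theta), d * math.sin(theta)
    if rng.random() < 0.5:                          # perhaps flip into region F
        dx = -dx
    phi = (math.pi / 3) * rng.randint(0, 5)         # one of six rotations
    return (dx * math.cos(phi) + dy * math.sin(phi),
            -dx * math.sin(phi) + dy * math.cos(phi))


def inside_hexagon(dx, dy, apothem=1.0, tol=1e-9):
    """True if (dx, dy) lies within all six edges of the hexagon."""
    return all(dx * math.cos(math.pi / 2 + k * math.pi / 3)
               + dy * math.sin(math.pi / 2 + k * math.pi / 3) <= apothem + tol
               for k in range(6))


rng = random.Random(5)
offsets = [hex_jitter(rng) for _ in range(2000)]
```

Note that drawing the angle and the distance uniformly does not spread the samples uniformly over the area of the hexagon (they concentrate toward the center); the sketch follows the recipe as described rather than correcting for that.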

i ← 0
while i < N
    x_i ← unit()
    y_i ← unit()
    reject ← false                              Throw a dart.
    for k ← 0 to i - 1
        d ← (x_i - x_k)^2 + (y_i - y_k)^2       Check the distance to all other samples.
        if d < (2r_p)^2 then
            reject ← true
            break
        endif                                   This one is too close; forget it.
    endfor
    if not reject then
        i ← i + 1
    endif                                       Append this one to the pattern.
endwhile

FIGURE 10.22

Building a Poisson-disk pattern by dart throwing.


One approach to building the pattern is called dart-throwing [101,124,307]. In 
this technique, randomly distributed samples are generated one by one and added 
into an accumulating pattern. Each new sample is compared against all the previously 
accepted samples; if the distance to its nearest neighbor is equal to or greater than 
the Poisson-disk distance r_p, that sample is accepted and added into the pattern. The
pseudocode for this technique is shown in Figure 10.22. An example of the result of 
this algorithm is shown in Figure 10.23. 

If the tile is large with respect to the eventual resampling frequency, then we 
can postulate that most of the aliasing errors will turn into noise, and only those 
structures that are very large in the image will turn into correlated aliasing errors in 
the sampled signal. If this assumption is valid, then we can evaluate the intersample 
distances in the dart-throwing algorithm as though the sample tile was actually a 
torus: the left and right sides are sewn together, as well as the top and bottom. This 
is so that the minimum-distance criterion is still satisfied when two tiles are placed 
side by side. 
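
A dart-throwing routine that treats the tile as a torus might be sketched as follows. The assumptions are ours: the tile is the unit square, the caller supplies the minimum allowed separation directly (min_dist), and the loop stops after a fixed budget of candidates (max_tries) rather than when a target count is reached. All names are hypothetical.

```python
import random


def torus_dist2(p, q):
    """Squared distance on the unit tile with opposite edges sewn together."""
    dx = abs(p[0] - q[0])
    dy = abs(p[1] - q[1])
    dx = min(dx, 1.0 - dx)   # the shorter way around, horizontally
    dy = min(dy, 1.0 - dy)   # and vertically
    return dx * dx + dy * dy


def dart_throw(min_dist, max_tries=10000, rng=random):
    """Accept each random candidate that is far enough from all earlier ones."""
    points = []
    for _ in range(max_tries):
        candidate = (rng.random(), rng.random())
        if all(torus_dist2(candidate, p) >= min_dist * min_dist for p in points):
            points.append(candidate)
    return points


tile = dart_throw(0.1, 5000, random.Random(7))
```

Because distances wrap around the edges, two copies of the tile placed side by side still respect the minimum separation across the seam. The number of samples that results is not known in advance; it depends on the separation and on how many candidates are tried.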

Dart-throwing is an expensive simulation algorithm and is only practical for small 
numbers of samples. It is also possible for the simulation to leave large holes in the
sampling pattern even after testing many candidates. Another problem with dart
throwing is that the input to the simulation is the radius of the minimum-separation
disk, so it is difficult to generate patterns with a predetermined number of samples.

FIGURE 10.23

(a) A pattern built by dart throwing, showing the local disks. (b) The pattern in (a) without the
disks. (c) The magnitude of the Fourier transform of (b).


10.5.13 Multiple-Scale Poisson-Disk Patterns

In the overview at the start of this chapter, the method of adaptive refinement
was mentioned as a common technique for increasing the sampling rate locally
in regions of high bandwidth. A number of different approaches will be discussed 
later in this chapter, but it is useful right now to mention that most of them are based 
on either subdivision or the use of an entirely new higher density sampling pattern. 

We will look at two techniques for building patterns that may be used for increasing
the sampling density locally while maintaining the statistical characteristics of
the samples. Both are based on building a sample tile with the desired characteristics
and then replicating that tile across the sampling domain. In particular, we can build
a pattern by drawing sequential samples one by one, in predetermined order, from
a precomputed list of sample positions. The list is built so that when sample n is
drawn, the samples s_0 to s_{n-1} satisfy the Poisson-disk criterion. The radius
used in the test decreases with increasing n. So by taking more samples from the list, 
the effective sampling rate goes up, while retaining a Poisson-disk characteristic. The 
complete list forms a tile that should be reflected and rotated to cover the sampling 
domain, as with other precomputed patterns. 

Both methods are variations on dart throwing, and are somewhat similar. We 
call the approach suggested by Mitchell [308] the best candidate algorithm. It takes 
as input two values: the number of desired samples N and a quality parameter q . 

x_0 ← unit()
y_0 ← unit()                                    Plant the initial sample.
for i ← 1 to N - 1
    d_max ← -1
    for c ← 1 to iq
        u ← unit()
        v ← unit()                              Make a candidate.
        d_c ← 1
        for k ← 0 to i - 1
            d ← (x_k - u)^2 + (y_k - v)^2
            d_c ← min(d_c, d)                   Get distance to nearest existing sample.
        endfor
        if d_c > d_max then
            x_i ← u
            y_i ← v
            d_max ← d_c                         Take this if it's the best so far.
        endif
    endfor
endfor

FIGURE 10.24

The best candidate algorithm.



The algorithm begins by placing a single sample at some random position in the tile; 
this is sample 1. The algorithm then iterates a placement procedure N — 1 times. In 
iteration i, the placement procedure creates iq uniformly distributed candidate samples in the
tile. These candidates are compared to the existing samples, and the candidate farthest
from all the others is added into the pattern. Since the number of candidates is scaled
by i, there is a constant ratio of candidates to existing samples. This algorithm is
given in pseudocode in Figure 10.24. 

As the value of q is increased, a larger number of candidates is generated each 
time around the loop. This is useful because the algorithm selects the best candidate 
in each iteration. The more candidates that are tested, the better our chance of 
finding one near the optimal position. This technique may be accelerated by using 
standard techniques for spatial search such as grids and quadtrees [373,374]. 

By keeping a table identifying the samples in the order of their creation, we can 
use the first n samples (for any n up to N) to get an approximate Poisson-disk pattern for different
sampling densities. An example of this algorithm is shown in Figure 10.25. 
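
The best candidate procedure is short enough to write out in full. The sketch below is one possible reading of it, assuming a unit-square tile, plain Euclidean distance (no wraparound at the tile edges), and a brute-force nearest-sample search; the function name and the rng argument are hypothetical.

```python
import random


def best_candidate(n, q, rng=random):
    """Return n samples; each new one is the best of i * q random candidates."""
    points = [(rng.random(), rng.random())]       # plant the initial sample
    for i in range(1, n):
        best, best_d = None, -1.0
        for _ in range(i * q):                    # candidates scale with i
            cand = (rng.random(), rng.random())
            # squared distance to the nearest existing sample
            d = min((cand[0] - p[0]) ** 2 + (cand[1] - p[1]) ** 2 for p in points)
            if d > best_d:                        # farthest from its neighbors so far
                best, best_d = cand, d
        points.append(best)
    return points


samples = best_candidate(30, 5, random.Random(1))
prefix = best_candidate(10, 5, random.Random(1))
```

Since the samples are stored in the order they were created, any leading portion of the list is itself a usable, sparser pattern: with the same random seed, the 10-sample tile here is exactly the first 10 entries of the 30-sample tile.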

Figure 10.25 shows an interesting phenomenon: the Fourier transform of the 
sampling pattern has a very specific structure. In the Fourier transform, there’s 
a spike at DC (in the center), then a ring of low magnitude, and then, beyond a 
certain distance from the center, the higher frequencies start to assert themselves 
in a noisy way. As we add more samples to our pattern, the low-magnitude ring 
around the center spike becomes larger. This is to be expected, since we know that 
multiplying a signal by this sampling pattern corresponds to convolving with the 
sampling pattern’s Fourier transform; with a wide ring around the center of the 
transform, nearby transforms will not overlap in the convolution. Distant samples 
will contribute some energy, though, because the magnitude of the transform comes 
back up outside of the center ring. But because this extra energy is noisy, it will not be 
strongly correlated with the sampling pattern, and the visual system will be inclined 
to tolerate it as noise, rather than interpret it as structured pattern. So we want 
this inner ring to be as large as possible (to keep interaction between local neighbors 
down, and thus reduce the introduction of small-scale patterns); the closer-packed 
the samples are in the signal domain, the larger the ring in the Fourier domain. This 
“spike-and-ring” pattern is a characteristic of Poisson-disk-like sampling patterns. 

A similar approach to creating a multiple-scale sampling tile, presented by
McCool and Fiume [294], is called the decreasing radius algorithm. The user specifies
N, the number of samples desired in the final tile, a magnification parameter
m, a disk-reduction fraction f, and a quality parameter q. The algorithm begins by
placing one point at random within the tile and setting the Poisson-disk radius r_p to
a large number (e.g., the width of the tile). Then an iteration is started that loops N
times. To find sample i, a new loop is entered that creates and tests new uniformly
distributed candidate points in the tile. This loop repeats until one of the candidates
satisfies the Poisson-disk criterion, or imq candidates have been tried. The value
of m is used to adjust the number of candidates created as the number of samples
already placed increases. If a new sample satisfies the Poisson-disk criterion, that sample
is added to the pattern. Otherwise, the disk radius is multiplied by the fraction
f and the candidates are tried again. The process is summarized in pseudocode in
Figure 10.26.
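
The description above can be turned into a short program. This is a sketch under our own assumptions rather than a transcription of the published algorithm: the tile is the unit square, distances are plain Euclidean, the first candidate that passes the test is accepted immediately, the radius shrinks only after a full round of imq failures, and n counts the total number of samples including the first. The names are hypothetical.

```python
import math
import random


def decreasing_radius(n, m, f, q, rng=random):
    """Return (points, radii); radii[i] is the radius in force when i was placed."""
    points = [(rng.random(), rng.random())]
    radius = 1.0                                   # start with the tile width
    radii = [radius]
    for i in range(1, n):
        placed = False
        while not placed:
            for _ in range(max(1, int(i * m * q))):
                cand = (rng.random(), rng.random())
                d2 = min((cand[0] - p[0]) ** 2 + (cand[1] - p[1]) ** 2
                         for p in points)
                if d2 > radius * radius:           # satisfies the current criterion
                    points.append(cand)
                    radii.append(radius)
                    placed = True
                    break
            if not placed:
                radius *= f                        # no luck: shrink the disks, retry
    return points, radii


tile_points, tile_radii = decreasing_radius(40, 1, 0.9, 5, random.Random(2))
```

Because the radius never grows, the first k samples of the list are separated by at least the radius recorded for sample k - 1, so every prefix of the list satisfies a Poisson-disk criterion of its own. The fraction f must lie strictly between 0 and 1 for the loop to make progress.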

One way to visualize this algorithm is as a series of cones, as in Figure 10.27.
Each time a sample is placed, it establishes the central axis of a right circular cone
perpendicular to the plane of the tile. The angle of the cones is controlled by f.
The algorithm sweeps the tile plane downward in equal steps toward the plane that 
contains the apex of all the cones (i.e., the point samples themselves). Each time the 
plane is moved, the algorithm tries to insert a new cone with initial radius equal to 
the radius of all other cones at that level. If no such cones can be fit, the plane is 
swept downward again. 

Examples of this algorithm similar to those for the best candidate algorithm 






FIGURE 10.25

The best candidate algorithm for N = 400 and q = 10. (a) The first 150 samples, (b) The 
magnitude Fourier transform of (a), (c) The first 300 samples, (d) The magnitude Fourier transform 
of (c). (e) The first 400 samples, (f) The magnitude Fourier transform of (e). 
x_0 ← unit()                                   Place first sample and set radius.
y_0 ← unit()
r_p ← 1
for i ← 1 to N
    placed ← false
    while not placed
        s ← 0
        d_min ← 1
        while s < imq
            u ← unit()                         Throw a dart.
            v ← unit()
            for j ← 0 to i - 1                 Find the nearest existing sample.
                d ← (x_j - u)^2 + (y_j - v)^2
                if d < d_min then
                    d_min ← d
                endif
            endfor
            if d_min > r_p then                It's a good one: save it.
                placed ← true
                x_i ← u
                y_i ← v
            endif
            s ← s + 1
        endwhile
        r_p ← r_p * f
    endwhile
endfor

FIGURE 10.26

The decreasing radius algorithm.
FIGURE 10.27

The geometry of the decreasing radius algorithm.


are shown in Figure 10.28. This algorithm also may be sped up by space-search 
techniques. 

The best candidate and decreasing radius algorithms share much in common, 
but there are differences. The best candidate algorithm is guaranteed to place a 
new sample every iteration. There is no guarantee that the pattern satisfies the 
Poisson-disk criterion at any level, since it simply chooses the best candidate. For 
the same reason, the nearest-distance value for a sequence of candidates may not 
be monotonic, which means the pattern may not be increasingly dense overall. The 
latter problem may be solved by sorting the points after they have been built [294]. 

The decreasing radius algorithm is guaranteed to satisfy the Poisson constraint,
but it requires three user parameters rather than one. The magnification parameter
m is actually a generalization of the best candidate algorithm’s scheme; setting m = 1
causes these two steps to be the same. The quality parameter q is also the same in
both algorithms. That leaves the cone angle f, the amount by which the cone is
decreased at each step. This should be a value only slightly less than one. Larger
values will cause the algorithm to run more slowly, and will tend to produce denser
patterns at each level.

FIGURE 10.28

The decreasing radius algorithm for N = 400, q = 10, m = 1, and f = 0.99. (a) The first 150
samples. (b) The magnitude Fourier transform of (a). (c) The first 300 samples. (d) The magnitude
Fourier transform of (c). (e) The first 400 samples. (f) The magnitude Fourier transform of (e).
(g) The number of points placed as a function of r for (e).

Both algorithms will always terminate with the desired number of samples. The
best candidate algorithm has the advantage that the candidate selection test may
be generalized to create patterns of any sort, not just Poisson-disk. The decreasing
radius algorithm will generate a better Poisson-disk pattern, but it requires the
evaluation function to be univariate and monotonically easier to satisfy with decreasing
values of r_p. The best candidate method is the more general of the two and is better
for satisfying arbitrary distributions; the decreasing radius method is tuned to
Poisson-disk patterns and is preferable for that case.


10.5.14 Sampling Tiles

The techniques in the last section precompute some small pattern or tile which is then 
replicated across the domain. The precomputed tile approach is attractive because 
it allows us to generate patterns that meet almost any criteria, without paying the 
generation cost at run-time. 

The drawback to the tile approach is that simple replication of the tile introduces 
the periodic sampling we want to avoid [124]. Figure 10.29(a) shows a nonuniform 
2D tile, and in Figure 10.29(b) that tile has been replicated by translation to cover 
the domain. The tile has a width w and contains n samples s_i. If we consider each
sample separately, then the operation of replicating the tile creates a uniform, square
grid that is w units on a side, with an origin at s_i, as in Figure 10.29(c). So we end
up with n different square grids, each slightly displaced relative to the others. Below we will
see a reconstruction method for this type of pattern. 

Because the repeated tile in effect creates multiple grids, we have created a pattern
that is stochastic locally but periodic globally. The periodicity will come back
to haunt us as large-scale structured aliasing artifacts in the signal. To avoid this
problem, we can try introducing transformations to the tile each time it is placed.
Suppose that the tile is square, as in Figure 10.30(a). Then there are eight possible
linear transformations of the tile, corresponding to the eight symmetry transformations
that preserve the square. These are the four right-angle rotations (0, 90, 180,
and 270°), each of which can be combined with a reflection, as in Figure 10.30(b).

FIGURE 10.29

(a) A single nonuniform tile. (b) The tile of (a) covering the plane by translation. (c) One of the
square grids induced by step (b).

FIGURE 10.30

(a) A square tile. (b) The eight linear transformations of (a).

We can break up the grids created by simple translation by applying one of these
transformations at random to each tile as it is laid down. It is often necessary,
however, to be able to go back to a region of the domain after it has been sampled
(when refining adaptively, for example). Rather than store an arbitrarily sized table
of transformations, we assign the numbers 0 through 7 arbitrarily to the eight
transformations, and specify the appropriate operation to be applied as the value
of a function T mapping the (x, y) plane to the integer range [0, 7]. It is important
that this function be aperiodic over the domain being tiled; we don’t care if it
repeats outside of that domain. These functions can be built from the standard noise
functions used for creating textures [272,338].
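A small Python sketch may make the bookkeeping concrete. Everything named here is an assumption made for illustration: transform_sample numbers the eight symmetries in one arbitrary way, and tile_transform_index is a hypothetical integer hash standing in for the noise-based function T, chosen only because it returns the same index whenever a tile is revisited.

```python
def transform_sample(x, y, k):
    """Apply symmetry k (0..7) of the unit square to the point (x, y)."""
    if k & 4:                      # bit 2: reflect about the vertical center line
        x = 1.0 - x
    for _ in range(k & 3):         # low bits: number of quarter turns about the center
        x, y = 1.0 - y, x
    return x, y

def tile_transform_index(tx, ty):
    """Stand-in for T: a repeatable value in 0..7 for the tile at integer (tx, ty)."""
    h = (tx * 73856093) ^ (ty * 19349663)
    h = ((h ^ (h >> 13)) * 0x5BD1E995) & 0xFFFFFFFF
    return (h >> 16) & 7

def place_tile(tile, tx, ty):
    """Return the tile's samples, transformed and translated to tile position (tx, ty)."""
    k = tile_transform_index(tx, ty)
    return [(tx + u, ty + v)
            for u, v in (transform_sample(x, y, k) for x, y in tile)]
```

Since the index is recomputed from the tile position, no table of past choices is needed when adaptive refinement returns to a region.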

Another alternative is to apply a continuous transformation to the tile, where the
parameters of the transformation are also drawn from an underlying function. If
the basic tile is a square, we might produce enough samples in the stored pattern to
fill the circle circumscribing that square, as in Figure 10.31(a). Then the underlying
parameter specifies an angle θ by which the pattern should be rotated about the
center of the square, as in Figure 10.31(b). The samples are clipped to the square
before they are used. Any of these methods can introduce error in our sampling
pattern by bringing samples too close together on the boundaries of transformed
tiles.

FIGURE 10.31

(a) The initial square tile and the samples in its circumscribing circle. (b) A rotation of the tile by θ
produces a new pattern.


10.5.15 Dynamic Poisson-Disk Patterns

The above methods for generating Poisson-disk patterns require an expensive prerendering
step and the storage of tables. Another class of techniques has been developed
that creates approximate patterns during rendering time; we call these dynamic or 
on-demand pattern-generation techniques. 


Point-Diffusion

Mitchell’s point-diffusion algorithm [307] adapts the technique of error diffusion
used by dithering algorithms [445] to guide the creation of approximate Poisson-disk
patterns. The algorithm scans a rectangular grid that has a frequency of about
four times the desired sampling density along each axis, and selects about one sample
out of sixteen. The underlying grid can be hidden by jittering each chosen sample. As
with dithering algorithms, the scan should proceed boustrophedonically (in opposite
directions on successive lines). A “diffusion value” D is computed for each point as
a combination of the values of D stored at its neighbors.

For a point at (m, n), the diffusion values D are gathered from above and adjacent
pixels to compute a temporary value T:

    T = R + \frac{4D_{m-1,n} + D_{m-1,n-1} + 2D_{m,n-1} + D_{m+1,n-1}}{8}    (10.6)

where R is a uniform noise source in the range [3/64, 5/64]. The value T is then used
to decide whether a sample point should be selected; if so, the variable S is set to 1:

    S = \begin{cases} 0 & \text{if } T < 0.5 \\ 1 & \text{otherwise} \end{cases}    (10.7)

The new diffusion value stored at this sampling point is then computed as

    D_{m,n} = T - S    (10.8)

for the values of T and S computed at this point.

The values in Equation 10.6 were chosen experimentally to create a pattern with 
the desired Fourier characteristics, and to be inexpensive to evaluate (only shifts are 
necessary if all values are integers). 

If this value exceeds a threshold, that sample is selected for evaluation, and the 
value associated with that sample is decreased by one. The algorithm is described 
in pseudocode in Figure 10.32. The function evaluate () is called to evaluate the 
signal at the given location. A bit of noise is added to the value of D to suppress 
orderly patterns; the range of noise is chosen to cause about one grid point in sixteen 
to be selected. 
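Equations 10.6 through 10.8 can be exercised with a short Python sketch. The details below are assumptions made for illustration rather than Mitchell's code: the function name point_diffusion is hypothetical, the grid is padded with zeros so that border points have neighbors, and on right-to-left rows the already-visited neighbor on the right takes the place of D at (m-1, n).

```python
import random

def point_diffusion(w, h, seed=0):
    """Return the (column, row) grid points selected on a w-by-h high-resolution grid."""
    rng = random.Random(seed)
    # Row 0 and columns 0 and w+1 are zero padding around the real grid.
    D = [[0.0] * (w + 2) for _ in range(h + 1)]
    selected = []
    for n in range(1, h + 1):
        left_to_right = (n % 2 == 1)           # boustrophedonic scan
        cols = range(1, w + 1) if left_to_right else range(w, 0, -1)
        prev = -1 if left_to_right else 1      # offset of the neighbor visited just before
        for m in cols:
            R = rng.uniform(3 / 64, 5 / 64)    # noise source of Equation 10.6
            T = R + (4 * D[n][m + prev] + D[n - 1][m - 1]
                     + 2 * D[n - 1][m] + D[n - 1][m + 1]) / 8
            S = 1 if T >= 0.5 else 0           # Equation 10.7
            D[n][m] = T - S                    # Equation 10.8
            if S:
                selected.append((m - 1, n - 1))
    return selected
```

The four weights sum to 8, so each stored value is passed on in full to later points; the mean of R, 1/16, therefore sets the long-run selection rate. A caller would jitter each returned grid point before evaluating the signal there.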

An example of the pattern generated by this algorithm and its Fourier transform
are given in Figure 10.33. As Figure 10.33(c) shows, it is important to scan
boustrophedonically to avoid directional artifacts in the pattern.


Hexagonal Jittering

Another type of dynamic, approximate Poisson-disk pattern may be generated by a 
direct application of jittering. The jittered-lattice techniques described in the previous 
section can be generated on demand by perturbing samples as they are created, but 
the methods as described do not satisfy the Poisson-disk criterion. It isn’t too hard 
to modify them to do so. 

Recall from above that the densest regular packing of samples in the plane corresponds
to a hexagonal lattice, so this is a reasonable place to start. We can produce



i ← 0
for r ← 0 to h - 1                                          Scan the high-resolution grid.
    for c ← 0 to w - 1
        D_{c,r} ← D_{c-1,r-1} + 2D_{c,r-1} + D_{c+1,r-1}     Get Ds from above.
        if r mod 2 = 0                                       Get D from left or right as appropriate.
            then D_{c,r} ← D_{c,r} + 4D_{c-1,r}
            else D_{c,r} ← D_{c,r} + 4D_{c+1,r}
        endif
        D_{c,r} ← (D_{c,r}/8) + symmetric(1/16 ± 1/64)       Average neighbors and add noise.
        if D_{c,r} > 0.5 then                                Above threshold: evaluate this sample.
            D_{c,r} ← D_{c,r} - 1
            evaluate(D_{c,r})
        endif
    endfor
endfor

FIGURE 10.32

The point-diffusion algorithm, scanning from left to right.



FIGURE 10.33

(a) A pattern generated by the point-diffusion algorithm scanning from left to right. (b) The
magnitude Fourier transform of (a). (c) The point-diffusion algorithm with boustrophedonic
scanning. (d) The magnitude Fourier transform of (c).
for r ← 0 to h - 1                               Scan all rows and columns.
    for c ← 0 to w - 1
        θ ← randomInterval(π, )
        d ← range((1 - 2r_p)/(2 cos θ))          New value for d.
        Δx ← d cos θ                             Pick a random point in the primary region.
        Δy ← d sin θ
        if flip() then                           Perhaps flip it into region F.
            Δx ← -Δx
        endif
        φ ← (π/3) * randomInteger(0, 5)          Pick one of six sides to rotate into.
        Δx' ← Δx cos φ + Δy sin φ                Find the jitter vector.
        Δy' ← -Δx sin φ + Δy cos φ
        k ← (rw) + c                             Add the jitter into the hexagon center.
        x_k ← (3c)/√3 + Δx'
        y_k ← 2(r + (c mod 2)) + Δy'
    endfor
endfor

FIGURE 10.34

Jittered hexagon approximation to a Poisson-disk pattern.


samples that satisfy the Poisson-disk constraint by making sure that they never get
closer than r_p to the outer perimeter of the hexagon. This is easily done by decreasing
the value of d in Figure 10.21. The new maximum value of d is given by

    d_{max} = \frac{1 - 2r_p}{2\cos\theta}    (10.9)

In order to have room to move the sample point, the distance from the center of
each hexagon to the midpoint of a side must be at least r_p. We can now write
the pseudocode for a jittered hexagon approximation to a Poisson-disk pattern in
Figure 10.34; the only change to Figure 10.21 is the calculation of d.
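The effect of Equation 10.9 can be checked numerically with a small Python sketch. The geometry here is assumed rather than taken from Figure 10.21: the hexagon is scaled so that each side's midpoint lies 1/2 from the center, θ is drawn from a wedge of 30 degrees measured from the direction of a side's midpoint, and d is drawn uniformly up to d_max. The name hex_jitter is hypothetical.

```python
import math
import random

def hex_jitter(r_p, rng):
    """Return a jitter vector that stays at least r_p inside the hexagon's perimeter."""
    theta = rng.uniform(0.0, math.pi / 6)            # angle within the primary wedge
    d_max = (1 - 2 * r_p) / (2 * math.cos(theta))    # Equation 10.9
    d = rng.uniform(0.0, d_max)
    dx, dy = d * math.cos(theta), d * math.sin(theta)
    if rng.random() < 0.5:                           # mirror into the neighboring wedge
        dy = -dy
    phi = (math.pi / 3) * rng.randrange(6)           # rotate toward one of the six sides
    return (dx * math.cos(phi) - dy * math.sin(phi),
            dx * math.sin(phi) + dy * math.cos(phi))
```

Each jittered point keeps a margin of r_p from every side of its own hexagon, so samples in adjacent hexagons can never be closer together than 2r_p.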

A result of this approach, and its Fourier transform, is shown in Figure 10.35. 


10.5.16 Importance Sampling 

In general, the samples we evaluate will make different contributions to the estimated
signal. For example, we may use image samples to estimate the incident
light on a surface, for the purpose of determining how much light is leaving that
surface in a particular direction. Each incident direction will then be weighted by
some reflectance function when it is taken into account. Thus we might say that
some samples are more important than others, in the sense that they make a larger
contribution to the final value.

FIGURE 10.35

(a) A regular hexagonal lattice. (b) The magnitude Fourier transform of (a). (c) A jittered hexagonal
lattice. (d) The magnitude Fourier transform of (c).

There are two general approaches to handling the different contributions of 
different samples. One approach is to distribute the samples with a uniform density 
and then weight each one appropriately, shown schematically in Figure 10.36(a). The 
other method is to distribute the samples in such a way that they fall more densely in 
regions that we know carry a larger weight. This latter approach is called importance 
sampling [182], and it has been an important part of most nonuniform sampling 
techniques in computer graphics [101,234]. This approach is shown schematically 
in Figure 10.36(b). 

Importance sampling is a useful technique in practice because it puts most of
the samples where they will be most influential. This helps reduce the number of
samples needed, since we won’t be taking many samples in regions of the signal that
are unimportant.

FIGURE 10.36

Two ways to accommodate a nonflat function. (a) Equal distribution of samples, unequal weighting.
(b) Unequal distribution of samples, equal weighting.

In practice, to apply importance sampling to a signal, we need to know the filter 
with which the signal will be modified. There are then two approaches to using this 
information. 

Intuitively, in the first approach we divide the filter into regions of equal area (or 
volume) and cast an equal number of samples into each region. This is shown in 
Figure 10.37. The samples may be distributed uniformly or quasi-randomly within 
each region; if they are jittered, the size of the jitter must be adjusted to the size of 
the region. 
FIGURE 10.37

Importance sampling by dividing a filter into regions of equal volume.


In the second approach, we divide the domain of the filter into equal pieces, and 
then try to place samples so that the relative density of samples between regions of the 
filter is proportional to the relative areas or volumes of those regions. This is shown 
schematically in Figure 10.38. This second approach requires more bookkeeping 
and seems to be rarely used in practice. 

One practical implementation of importance sampling begins with carrying out 
the equal-volume segmentation of the filter discussed above, followed by taking 
an equal number of samples in each region [101]. This requires a preprocessing 
subdivision step, and also means we must decide how many regions to subdivide the 
signal into before we start sampling. An advantage of this approach is that we are 
guaranteed to sample all the regions of the signal; a disadvantage is that we cannot 
smoothly increase the sampling density. The method may be improved to support 
adaptive sampling by subdividing each region on demand [234]. 

Another approach is to generate uniformly distributed points in a canonical space
(such as the unit interval or unit square), and then deform that space to match the
desired density. This approach is discussed in some detail in [400], where Shirley
shows the warping to match a separable 2D triangular function. This function may
be expressed with the center at the origin as the product of two 1D functions:

    w(x, y) = w(x)w(y) = (1 - |x|)(1 - |y|)    (10.10)





FIGURE 10.38


Dividing a filter into equal-sized regions. 


We can write the distribution function W(x) associated with w(x) as the probability
that a value drawn with density w is less than x [331]. Since w(x) is the density of
points at x, the distribution function is just the integral of w from its lower limit
(here -1) to x:

    W(x) = \int_{-1}^{x} (1 - |r|)\,dr = \frac{1}{2} + x - \frac{1}{2}x|x|    (10.11)

The values of W(x) can be precomputed and stored in a sum table [111].

To make a set of samples (x, y) that match the distribution w(x, y), we generate
pairs (u, v) of values uniformly distributed on [0, 1] and find
(x, y) = (W^{-1}(u), W^{-1}(v)), where W^{-1} is the inverse of W, given by

    W^{-1}(a) = \begin{cases} -1 + \sqrt{2a} & a < 0.5 \\ 1 - \sqrt{2(1 - a)} & \text{otherwise} \end{cases}    (10.12)
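As a quick check on Equations 10.11 and 10.12, the following Python sketch implements W and its inverse and uses them to warp uniform random pairs. The function names are chosen here for illustration; a stratified or jittered source of (u, v) pairs could be substituted for the plain random numbers.

```python
import math
import random

def W(x):
    """Distribution function of the 1D tent density 1 - |x| on [-1, 1] (Equation 10.11)."""
    return 0.5 + x - 0.5 * x * abs(x)

def W_inv(a):
    """Inverse of W (Equation 10.12), mapping [0, 1] onto [-1, 1]."""
    if a < 0.5:
        return -1.0 + math.sqrt(2.0 * a)
    return 1.0 - math.sqrt(2.0 * (1.0 - a))

def tent_samples(n, seed=0):
    """Warp n uniform pairs (u, v) into points distributed as (1 - |x|)(1 - |y|)."""
    rng = random.Random(seed)
    return [(W_inv(rng.random()), W_inv(rng.random())) for _ in range(n)]
```

Since W(0.5) - W(-0.5) = 0.75, about three quarters of the warped x values should land in the central half of the interval, compared with one half for unwarped samples.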


The multiple-scale templates discussed earlier may also be used to perform importance
sampling, using a method described by McCool [294]. The basic idea is to
realize that the order in which points were added to the pattern roughly corresponds 
to the density of the pattern when they were created. So if a sample was added late in 
the pattern’s development, it probably represents a high-density sample and is nearer 
to at least one other sample than earlier samples. We can use this observation to 
f_min ← min(f)                                    Get filter bounds for normalization.
f_max ← max(f)
for i ← 1 to n
    r_i ← i/n                                     For each sample, find local and filter value.
    c_i ← (f(s_i) - f_min)/(f_max - f_min)
    if r_i < c_i then                             We're below the filter, so use this sample.
        evaluate s_i
    endif
endfor

FIGURE 10.39

Importance-sampling a multiple-scale pattern.


choose samples from the template, selecting many of them in regions where the filter 
has a large value and ignoring those where the filter is small. 

To use this idea, we scan through the samples one by one. For each sample, we 
estimate its local density simply by its index, which tells us when it was added to 
the pattern. We normalize this index by dividing by the total number of samples 
n. We then evaluate the filter function at this sample location, and again normalize 
(effectively scaling and shifting the function so that it runs from 0 to 1). If the sample 
density is less than the filter density, then that sample is selected and evaluated. 
Pseudocode for this algorithm is given in Figure 10.39. An example of the result for 
a particular filter is shown in Figure 10.40. 
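The selection rule is compact enough to state as a Python sketch. Two assumptions are made here: the samples arrive in the order in which they were added to the multiple-scale pattern, and the filter bounds are taken over the sample locations rather than over the whole filter. The name select_by_filter is hypothetical.

```python
def select_by_filter(samples, f):
    """Keep the samples whose normalized index falls below the normalized filter value."""
    n = len(samples)
    values = [f(x, y) for x, y in samples]
    f_min, f_max = min(values), max(values)
    span = (f_max - f_min) or 1.0          # guard against a constant filter
    chosen = []
    for i, ((x, y), fv) in enumerate(zip(samples, values), start=1):
        r_i = i / n                        # density estimate: when the sample was added
        c_i = (fv - f_min) / span          # filter value rescaled to run from 0 to 1
        if r_i < c_i:
            chosen.append((x, y))
    return chosen
```

With a filter such as exp(-5(x^2 + y^2)), early samples survive over a wide area while late, high-density samples survive only near the filter's peak.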

A graphical look at this technique is shown in Figure 10.41 using the distribution 
of points as in Figure 10.27. As the filter value becomes larger, it dips into denser 
regions of the pattern and includes more samples. 


10.5.17 Multidimensional Patterns

Most of the patterns we have looked at are 2D in the domain of the signal. When
the signal is a 3D scene and the samples determine visibility, they are projecting a
3D function to 2D. In general, the patterns used in graphics project from R^m to R^n,
where m > n.

An important projection is the one from a 3D scene onto a 2D surface (e.g., a
viewing plane or an illumination hemisphere). In a distribution ray tracer, the 3D
scene is augmented with a number of other parameters. For example, we might
associate a time with each sample and a particular angle to be used for possible
reflections off surfaces. These two additional parameters join the three spatial ones,
so our samples are actually 5D. When all samples have been evaluated, we still
have a 2D function f(x, y) giving the intensity of the signal as a function of spatial
coordinates.

FIGURE 10.40

A Poisson-disk pattern weighted by exp(-5(x^2 + y^2)). Redrawn from McCool and Fiume in Proc.
Graphics Interface ’92, fig. 16, p. 102.

FIGURE 10.41

An illustration of importance sampling of a variable-scale pattern. The filter is the volume scooping
downward in the figure; samples inside the filter are circled.

The two spatial coordinates play a special role: they are the only ones that 
are conserved through the sampling process. The other parameters of the sample 
are created but then discarded once the sample has been evaluated. We will call the 
spatial parameters direct parameters of the sample, and all others indirect parameters. 
We will use the letters x and y to refer to the direct parameters of a sample, and
we will put them at the start of any list of dimensions associated with a sample.
Some dimensions involved in sampling are coupled because together they represent a
single domain (e.g., x, y on the screen or u, v texture coordinates on a surface), while
others are independent (e.g., the time parameter). This topic has been discussed at 
length by Mitchell [308]; the discussion in this section is based on that presentation. 

The most direct way to sample d dimensions is to generate n samples, each of
which has associated parameters

    s_i = \{x_i, y_i, u_i, t_i, \ldots\}    (10.13)

where each parameter comes from an independent random variable. We know that
stratified sampling often leads to a good estimate of a signal value faster than random
sampling, so we can imagine dividing each dimension i into N_i regions. This creates
a hypervolume that has N_i cells along dimension i, for a total of
N_1 × N_2 × ⋯ × N_d cells in the volume.

If there are d dimensions and we split each one into n cells, then to place one
sample in each cell, we need n^d samples. This required number of samples
rises very quickly with both the number of dimensions and the number of cells per 
dimension (in current practice, these numbers are often more than eight and sixteen, 
respectively). 
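To get a feel for how quickly n^d grows, the figures mentioned above can be plugged in directly. The short Python sketch below uses a hypothetical helper name and simply evaluates the count for sixteen cells in each of eight dimensions.

```python
def stratified_sample_count(cells_per_dimension, dimensions):
    """Samples needed to put one sample in every cell of a fully stratified hypervolume."""
    return cells_per_dimension ** dimensions

# Sixteen cells in each of eight dimensions: 16^8 = 2^32 samples.
count = stratified_sample_count(16, 8)
```

That is more than four billion samples for a single estimate, which is why strategies that stratify only the projections of the hypervolume are attractive.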

An alternative is sampling with incomplete block designs. In this strategy, we
imagine that the hypervolume described above is projected onto each dimension
(or coupled pair); we design our sampling pattern so that each cell in the stratified
projection domain is sampled at least once. For example, we fill the 3D volume
representing (x, y, t) with samples in such a way that if we project the volume onto
the (x, y) plane, each 2D cell contains at least one sample, and if we project the
volume onto the t axis, each 1D cell contains at least one sample.

A means to accomplish this was suggested by Cook [101]. His application was 
image sampling. For each pixel, he stratified the (x, y) domain into sixteen regions. 
The time domain was then stratified into sixteen intervals. Each sample in the (x, y) 
plane was identified with one of the time intervals by use of a template, like the one 
in Figure 10.42. The precise time for each sample may be jittered within its interval. 
This approach can be extended to other indirect parameters, such as reflection angle 
and location on a lens. 
FIGURE 10.42

Pattern for associating time intervals with spatial regions. Data from Cook in ACM Transactions
on Graphics, fig. 8, p. 62.

FIGURE 10.43

     1   2   3   4
     5   6   7   8
     9  10  11  12
    13  14  15  16

A bad pattern.


Cook noted that if the placement of samples in the cells is correlated in some 
regular way, then these artifacts will influence the final estimate of the signal (in an 
image, we will see aliasing artifacts like those from regular sampling). For example, 
suppose the time values were assigned to the regions as in Figure 10.43. In the figure, 
the pixel area is shown swept over time. We have broken the time interval into four 
quarters, so samples with values 1 through 4 fall into the nearest quarter of the cube. 

Suppose a black object is moving downward over a white background in the pixel 
as in Figure 10.44(b); the object only covers 1/4 of the pixel and moves in quick steps
every 1/4 second. This (admittedly pathological) example demonstrates the problem
that comes up when the motion is correlated with the sample assignment. Every
space-time cell that is sampled contains the object, so the entire pixel will be black.
If the object is moving upward as in Figure 10.44(c), every space-time cell that is
sampled is empty, which results in the equally incorrect answer of an entirely white 
pixel. In general, if there are correlations within a pixel, those patterns will tend to 
be amplified by the regular geometry of the pixel grid, and the resulting errors will 
be easily noticeable. Mitchell has observed that these correlations may be thought 
of as hyperplanes in the d-dimensional hypercube. 

FIGURE 10.44

(a) A sampling pattern in space-time. (b) One bad situation for the pattern. (c) Another bad
situation for the pattern.

FIGURE 10.45

N-rooks in d dimensions.

                     Sample parameters
    Permutation   Sample 1   Sample 2   Sample 3   Sample 4   Sample 5
    π_x               1          5          3          2          4
    π_y               3          2          4          1          5
    π_t               4          1          5          2          3

Another way to distribute samples in the d-dimensional hypercube is with the
d-dimensional form of N-rook sampling discussed earlier [397]. Suppose that each
parameter is divided into n cells. We make a table of d permutations π_1, π_2, ..., π_d of
the numbers {1, 2, ..., n}; these are just d different ways to list the n integers. Then
to place sample i in these d dimensions, assign the first parameter to region π_1(i),
the second to region π_2(i), and so on. An example is shown in Figure 10.45.
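The permutation table translates almost directly into code. The Python sketch below is one possible reading, with assumptions noted: cells are numbered from 0 rather than 1, every parameter is scaled to the unit interval, and each sample is jittered uniformly within its assigned cell. The name n_rooks is hypothetical.

```python
import random

def n_rooks(n, d, seed=0):
    """Return n samples in the unit d-cube, one per cell along every axis."""
    rng = random.Random(seed)
    # perms[k] lists the cell of parameter k used by samples 0, 1, ..., n-1.
    perms = [rng.sample(range(n), n) for _ in range(d)]
    return [tuple((perms[k][i] + rng.random()) / n for k in range(d))
            for i in range(n)]
```

Only n samples are generated, yet the projection onto any single axis fills each of the n intervals exactly once.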

Both of the techniques described above work well in the sense that their projections
onto the different dimensional axes, planes, or hyperplanes fit our criteria of
filling each interval at least once. But except for the way templates are used in the 
first method, there is no real distinction between direct and indirect parameters. We 
might ask if we can get an equally good answer with fewer samples by recognizing 
the basic difference between these two types of parameters. 

Mitchell has reported that the distribution of the indirect parameters indeed 
makes a difference [308]. First, consider just the distribution of the direct parameters. 
We know that we want this to be as high-frequency a pattern as possible. Now select 
some circular region of these parameters, and start assigning values of an indirect 
parameter (say t). We want the distribution of t within this region to have a
high-frequency pattern as well. If the region we selected is small, then there will not
be many samples, so the values of t must be very different to satisfy the
high-frequency requirement. We should thus require that samples that are nearby in the
direct parameters be far apart in the indirect parameters. Mitchell also discusses the 
situation of handling multiple indirect parameters, some of which may be coupled. 

A variation on the point-diffusion algorithm may be used to generate these sample 
values. We begin by stratifying the direct parameters (say x and y) and then scanning 
the resulting regions. We will place one sample in each direct-parameter region and 
associate one region from each indirect parameter with each sample. These values 
may then be jittered. 

The scanning algorithm is shown in Figure 10.46 for assigning t values on the
basis of (x, y) parameters. We want to assign values to the square marked with a
bullet. In this figure, we are scanning top to bottom, left to right. The cells marked
S are previously scanned “secondary” (or second-neighbor) cells, and those marked
P are “primary” (or first-neighbor) cells; these have already been assigned values of
t. An implementation should alternate left-to-right and then right-to-left scanning
directions on successive rows.

FIGURE 10.46

Sample generation. Data from Mitchell in Computer Graphics (Proc. Siggraph ’91), fig. 7, p. 160.

Mitchell presented a variation of the best-candidate algorithm, which we call the 
two-stage best-candidate algorithm . The process is two passes of the best-candidate 
procedure, designed to make the best pattern on both a local basis (i.e., with respect 
to the P cells) and a larger neighborhood (i.e., the S cells). First generate a uniformly
distributed samples of t, where a is the number of candidates. Find the distance of each
of these values of t to each of the
four values of t stored at the P cells. Select the b candidates that have the largest 
minimum distance (i.e., those b values that are the farthest from any of the values 
stored in the P cells). Now repeat the process with the S cells and select the one 
candidate that is the farthest from any t in the S cells. Mitchell suggests values of 
a = 100 and b = 10. 
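A minimal Python sketch of the two passes follows. It assumes that t lies in the unit interval and that the distance between two values of t is simply their absolute difference; the function name and the rng argument are introduced here for illustration only.

```python
import random

def two_stage_best_candidate(p_values, s_values, a=100, b=10, rng=random):
    """Pick a value of t far from the P-cell values and, among those, far from the S cells."""
    candidates = [rng.random() for _ in range(a)]

    def min_dist(t, others):
        # distance from t to the closest value in others (infinite if there are none)
        return min((abs(t - o) for o in others), default=float("inf"))

    # Stage 1: keep the b candidates with the largest minimum distance to the P cells.
    finalists = sorted(candidates, key=lambda t: min_dist(t, p_values), reverse=True)[:b]
    # Stage 2: of those, take the one farthest from every value in the S cells.
    return max(finalists, key=lambda t: min_dist(t, s_values))
```

A caller would pass the values of t already stored in the P cells and the S cells around the current cell, then store the returned value in that cell before moving on.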

The two-stage algorithm may be generalized to three or more stages to incorporate 
additional indirect parameters. It is important to consider these closely because it 
is not just the projected (or marginal) distribution of each parameter that matters, 
but also the joint distribution of several parameters taken together. For example, 
Figure 10.47 shows two 1D parameters attached to eight samples. The projected
distributions of each parameter are the same (eight equally spaced points), but their
joint distributions are quite different. The distribution on the right is almost perfectly
correlated along a line, and hence it badly samples any signal that appears parallel to
that line. The distribution with less correlation samples the signal better.

So a good distribution of indirect parameters is such that the projected distributions
all have most of their energy in high frequencies, and the joint distributions
are as weakly correlated as is practical. The two-level sample generation algorithm 
above may be generalized to n levels for n different sets of parameters. 
FIGURE 10.47

(a) Uncorrelated joint distribution. (b) A linearly correlated joint distribution. The projected
distributions are good in both cases, but the joint distribution in (b) is not very good. The big
circles represent sample locations, and the small black disks are places where the signal is 1.
Redrawn from Mitchell in Computer Graphics (Proc. Siggraph ’91), fig. 1, p. 158.


10.5.18 Discussion 

There are several ways to characterize the quality of the sampling patterns described 
above. Unfortunately, at the moment there is no final word on which pattern is best 
in all respects. 

One way to judge the pattern is to visually examine an average Fourier transform 
derived from many examples of each class and judge it with respect to an ideal 
frequency distribution. Examination of the figures in the preceding sections shows
that all of the patterns have a DC spike, a small clear zone around the spike, and 
then a sea of noise, which is roughly the “blue-noise” spectrum. Since we would like 
the clear zone to be as large as possible, one characterization is to rate the different 
methods on the radius at which the frequency response starts to become significant. 

Dippe and Wold advocate an analytic approach based on the power signal-to- 
ttoise ratio (SNR) [124]. This measure is proportional to the average number of 
samples per unit region of the pattern, and to the flat field response noise spectrum 
(FFRNS), which is the noise part of the flat field response scaled by the sampling 
rate of the pattern. Recall that the flat field response is the result of sampling a flat 
(or uniform) signal with a given pattern. Dippe and Wold analyzed the FFRNS for 
Poisson and jitter patterns. They found that low-frequency noise is reduced more by 
jittered patterns than by Poisson-disk patterns, and felt that this produced perceptually
better results when these patterns were used for sampling images. This approach
has the advantage of providing a quantitative measure of sampling response, but the 
FFRNS can be difficult to interpret. 

Recall that when discussing Monte Carlo in Chapter 7, we mentioned that discrepancy
was one way to measure how well a set of points is distributed over a
domain.

Intuitively, discrepancy measures the difference between the number of samples 
we expect in a given area and the number we actually find there. For example, 
recall that our prototype sampling tile is a unit square. Within any region R of 
area A(R), we can count the number of samples n within the region. If there are N 
samples uniformly distributed on the square, then we would expect the ratio of n/N 
(giving the percentage of points within the region) to be about the same as A(R) (the 
percentage of area of the unit square occupied by R). 

One definition of discrepancy due to Zaremba [397] is based on using rectangular
regions with one corner at the origin and one corner at (a, b), as in Figure 10.48.
Then the discrepancy $\Delta(x, y)$ is defined as the least upper bound of the difference
between the estimated area ratio and the counted sample ratio:

$$\Delta(x, y) = \sup \left| n/N - ab \right| \tag{10.14}$$

A slightly more general definition due to Stroud allows the origin of the box to
appear at any point (c, d), as in Figure 10.49. The definition is then

$$\Delta(x, y) = \sup \left| n/N - (a - c)(b - d) \right| \tag{10.15}$$
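The two definitions can be tried out numerically. The following Python sketch is one brute-force way to do so under our own assumptions: it only tests boxes whose corners lie on sample coordinates (plus 0 and 1), and it counts both open and closed boxes. The function names are ours and do not come from the cited papers; the cost grows quickly with the number of samples, so this is meant for small point sets only.

```python
import itertools


def zaremba_discrepancy(points):
    """Brute-force estimate of Eq. 10.14 for points in the unit square."""
    n_total = len(points)
    xs = sorted({p[0] for p in points} | {1.0})
    ys = sorted({p[1] for p in points} | {1.0})
    worst = 0.0
    for a, b in itertools.product(xs, ys):
        # Count samples in the closed and in the open box from (0,0) to (a,b).
        closed = sum(1 for x, y in points if x <= a and y <= b)
        opened = sum(1 for x, y in points if x < a and y < b)
        area = a * b
        worst = max(worst,
                    abs(closed / n_total - area),
                    abs(opened / n_total - area))
    return worst


def stroud_discrepancy(points):
    """Brute-force estimate of Eq. 10.15: boxes from (c,d) to (a,b)."""
    n_total = len(points)
    xs = sorted({p[0] for p in points} | {0.0, 1.0})
    ys = sorted({p[1] for p in points} | {0.0, 1.0})
    worst = 0.0
    for c, a in itertools.combinations(xs, 2):
        for d, b in itertools.combinations(ys, 2):
            closed = sum(1 for x, y in points
                         if c <= x <= a and d <= y <= b)
            opened = sum(1 for x, y in points
                         if c < x < a and d < y < b)
            area = (a - c) * (b - d)
            worst = max(worst,
                        abs(closed / n_total - area),
                        abs(opened / n_total - area))
    return worst


# A regular 4 x 4 grid of cell centers as a small test pattern.
grid = [((i + 0.5) / 4, (j + 0.5) / 4) for i in range(4) for j in range(4)]
print(zaremba_discrepancy(grid), stroud_discrepancy(grid))
```

Because every box anchored at the origin is also a box of the more general kind, the second estimate is never smaller than the first for points strictly inside the square.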

Shirley has pointed out [397] that we can define a generalized discrepancy that
takes into account regions of different shapes and sizes. Below, we will see discrepancy
based on disks and triangular and quadrilateral portions of a square.

FIGURE 10.48

Calculating discrepancy for a box from (0,0) to (a, b).


FIGURE 10.49

Calculating discrepancy for a box from (c, d) to (a, b).


Two-dimensional discrepancies

Process         16 points   256 points   1,600 points
Zaremba         0.0358      0.00255      0.000438
Jittered        0.0489      0.00633      0.00160
Dart-throwing   0.0490      0.00799      0.00254
N-rooks         0.0461      0.0101       0.00391
Poisson         0.0932      0.0233       0.00932

TABLE 10.2

Two-dimensional discrepancies. Source: Data from Mitchell, Third Eurographics Workshop on
Rendering, table 1, p. 64.


There is a surprising connection between the distribution of sample locations in
the pattern and the quality of the estimated signal. Mitchell [309] has pointed out
a theorem for 1D signals that says, roughly, if a given set of samples $x_1, \ldots, x_N$ in
the unit square has discrepancy D and their variance is bounded by V, then the
difference between the average value of the evaluated samples and the real integrated
value of the function is bounded by the product VD. In symbols,

$$\left| \frac{1}{N} \sum_{i=1}^{N} f(x_i) - \int_0^1 f(t)\,dt \right| \le VD \tag{10.16}$$


This result suggests that if we keep the number of samples in our pattern fixed, then 
as we lower the discrepancy on our sampling pattern, we improve our estimate of 
the signal by decreasing the error. 
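A small numerical check can make this bound concrete. The Python sketch below rests on assumptions of ours: it reads V as the total variation of f, which is how the classical 1D inequality of this form (Koksma's inequality) is usually stated, and it uses the standard closed form for the 1D star discrepancy of a sorted point set. The test function f(t) = t^2 and all names are illustrative only.

```python
import random


def star_discrepancy_1d(xs):
    """Exact 1D star discrepancy of points in [0, 1] (sorted closed form)."""
    pts = sorted(xs)
    n = len(pts)
    return 1.0 / (2 * n) + max(
        abs(x - (2 * i + 1) / (2.0 * n)) for i, x in enumerate(pts))


def integration_error(f, xs, exact):
    """Absolute error of the sample average as an estimate of the integral."""
    return abs(sum(f(x) for x in xs) / len(xs) - exact)


def f(t):
    # Monotone on [0, 1]: total variation 1, exact integral 1/3.
    return t * t


rng = random.Random(1)
n = 64
random_pts = [rng.random() for _ in range(n)]               # Poisson-like
jittered_pts = [(i + rng.random()) / n for i in range(n)]   # one per cell

for name, pts in (("random", random_pts), ("jittered", jittered_pts)):
    d = star_discrepancy_1d(pts)
    err = integration_error(f, pts, 1.0 / 3.0)
    print(name, "discrepancy:", d, "error:", err, "bound V*D:", 1.0 * d)
```

With the number of samples held fixed, the jittered set has the smaller discrepancy and therefore the tighter error bound, which is the behavior the inequality suggests.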

Both Shirley [400] and Mitchell [309] present some numerical results that evaluated
discrepancy for a variety of patterns and the magnitude of error in a variety of
2D images sampled by those patterns. We present those results below; our comments
follow those of the authors.

Table 10.2 shows the discrepancy measured for five different types of patterns,
using three different numbers of samples. For the pseudorandom processes, the reported
value represents the average of 100 trials. The data is plotted in Figure 10.50.

The data from Table 10.2 is useful because of the wide range of densities it spans;
we assume it represents the asymptotic behavior of the pattern as the number of samples
increases. Recall that our main interest is for relatively low sampling densities, since
we want to take as few samples as we can. But sometimes large numbers of samples
are necessary, and in any case understanding the long-term behavior of a pattern can
help us characterize it.


FIGURE 10.50

The 2D discrepancy data from Table 10.2.


The pattern that shows the worst asymptotic behavior is the Poisson (or purely 
random) pattern, which improves only with the square root of the sample size. 
The Zaremba process improves most dramatically, with the other techniques falling 
between these two. 
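One way to read these trends off the table is to fit a power law between the first and last columns. The Python sketch below does this for the Table 10.2 values; describing the data by a single exponent k, with discrepancy proportional to N^(-k), is our simplification and not part of Mitchell's analysis.

```python
import math

# Values copied from Table 10.2 (16, 256, and 1,600 points).
table_10_2 = {
    "Zaremba":       (0.0358, 0.00255, 0.000438),
    "Jittered":      (0.0489, 0.00633, 0.00160),
    "Dart-throwing": (0.0490, 0.00799, 0.00254),
    "N-rooks":       (0.0461, 0.0101,  0.00391),
    "Poisson":       (0.0932, 0.0233,  0.00932),
}


def convergence_exponent(d_first, d_last, n_first=16, n_last=1600):
    """Return k such that discrepancy ~ N**(-k) between the two endpoints."""
    return math.log(d_first / d_last) / math.log(n_last / n_first)


exponents = {name: convergence_exponent(row[0], row[2])
             for name, row in table_10_2.items()}

for name, k in sorted(exponents.items(), key=lambda item: -item[1]):
    print(f"{name:14s} k = {k:.3f}")
```

An exponent of 0.5 corresponds to improvement with the square root of the sample count; a larger exponent means the discrepancy falls faster as samples are added.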

An alternative measure of discrepancy uses circular regions, or disks, rather than 
the axis-aligned rectangles used above. This shape seems to offer a more isotropic 
measure of discrepancy. The data measured by Mitchell for disks is shown in 
Table 10.3 and plotted in Figure 10.51. 

Again we see that Poisson patterns fare rather poorly. It is interesting that dart-throwing
and jittered patterns outperform Zaremba’s pattern.

Random-disk discrepancies

Process         16 points   256 points   1,600 points
Dart-throwing   0.0840      0.0120       0.00368
Jittered        0.0994      0.0165       0.00394
Zaremba         0.0855      0.0160       0.00511
Poisson         0.104       0.0239       0.00993
N-rooks         0.0908      0.0224       0.0104

TABLE 10.3

Two-dimensional disk discrepancies. Source: Data from Mitchell, Third Eurographics Workshop
on Rendering, table 3, p. 65.


FIGURE 10.51

The 2D discrepancy data for circular regions from Table 10.3.


One of the characteristics we want from a pattern is its ability to capture edges
of all orientations. That is, do we get at least one sample on both sides of every
edge, whatever its orientation? To test this quality, Mitchell created a set of 10,000
random lines in the unit square [309]. He measured discrepancy using the region
above each line. The data measured by Mitchell for these edges is shown in Table 10.4
and plotted in Figure 10.52.

Zaremba’s pattern again performs the best, though jittered and dart-throwing
patterns come in a close second.


The discrepancy values given above are useful in quantifying one characteristic of 
a sampling pattern, namely how uniformly the points are distributed in the tile. They 
also offer some experimental support for the jittered and dart-throwing patterns, 
which have justifications from the signal-processing point of view. 

But discrepancy does not describe the quality of an image generated with that
pattern. For example, Mitchell notes that the Zaremba pattern leads to moiré
patterns, and jittered sampling produces images that are grainier than dart-throwing
patterns. We conclude that discrepancy is one measure for checking the quality of
a proposed sampling pattern, but it is not the primary criterion to be used when
choosing a pattern.


Random-edge discrepancies

Process         16 points   256 points   1,600 points
Zaremba         0.0504      0.00478      0.00111
Jittered        0.0538      0.00595      0.00146
Dart-throwing   0.0613      0.00767      0.00241
N-rooks         0.0637      0.0123       0.00488
Poisson         0.0924      0.0224       0.00866

TABLE 10.4

Two-dimensional edge discrepancies. Source: Data from Mitchell, Third Eurographics Workshop
on Rendering, table 4, p. 66.


FIGURE 10.52

The 2D discrepancy data for edges from Table 10.4.


Process         Checkerboard   Angled checkerboard   Sine
N-rooks         0.0208         0.0231                0.0517
Jittered        0.0260                               0.0486
Dart-throwing   0.0271                               0.0423
Regular         0.0315         0.0252                0.0172
Poisson         0.0359         0.0355                0.0679

TABLE 10.5

Average pixel errors for different patterns. Source: Data from Shirley in Proc. Eurographics '91,
table 3, p. 189.

In [400] Shirley measured discrepancies for a variety of sampling patterns; the 
values closely match those given above. But Shirley also applied these patterns to 
a set of images and evaluated the average pixel error with respect to a high-quality 
reference image. A measure of pixel error such as this does not tell us anything 
about the distribution of the error; we cannot tell if the error is uniformly distributed 
throughout the image in the form of noise, or organized into highly structured 
artifacts. This is a difficult issue in any case, because the presence of structured 
aliasing errors depends as much on the underlying continuous image as the sampling 
pattern that can beat (that is, periodically interact) with it. But pixel errors do give 
us some idea of how close the final image is to the original in an absolute sense. 
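The limitation just described is easy to demonstrate. In the Python sketch below, the error measure is a plain mean absolute difference; that choice and the tiny 4 x 4 test images are assumptions of ours, not Shirley's exact procedure. Two images with very differently arranged errors receive the same score.

```python
import math


def average_pixel_error(image, reference):
    """Mean absolute difference between two images stored as lists of rows."""
    total = 0.0
    count = 0
    for image_row, reference_row in zip(image, reference):
        for a, b in zip(image_row, reference_row):
            total += abs(a - b)
            count += 1
    return total / count


reference = [[0.0] * 4 for _ in range(4)]

# Error scattered in a checkerboard arrangement (noise-like).
scattered = [[0.1 if (r + c) % 2 == 0 else 0.0 for c in range(4)]
             for r in range(4)]

# The same amount of error concentrated in a band (a structured artifact).
banded = [[0.1 if r < 2 else 0.0 for c in range(4)] for r in range(4)]

print(average_pixel_error(scattered, reference))
print(average_pixel_error(banded, reference))
```

Both calls print the same value even though the second image would show a visible band, so the number alone cannot tell noise from structure.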

We summarize Shirley’s measurements in Table 10.5. Three scenes were evaluated.
One was a checkerboard receding into the distance, where one axis of the
board was aligned with the x-axis of the image plane; thus each scan line cut across
a single row of squares. The second was an angled checkerboard rotated 45° to the
first, resulting in sighting down diagonals toward the horizon. The third pattern was
a sine wave ($\sin(r^2)$), where r is the distance of a pixel to the image origin; this is a
set of smoothly varying rings that become thinner and more tightly packed as they
spread from the origin. We give Shirley’s results for reconstruction with a triangular
filter. The message is mixed, but it appears that for this measurement of quality and
these images, the jittered and dart-throwing patterns are the best overall performers.


10.6 Refinement

After the initial sampling is completed, you may decide that certain regions of the 
domain need closer examination. This decision may depend on a variety of factors, 
but it is usually based on the idea of increasing the local sampling frequency in 
regions of high local bandwidth. Estimation of the local bandwidth is usually done 
implicitly, rather than by calculating a Fourier or wavelet transform. In general, the 
values of some samples in a neighborhood are tested against some criteria. Some 
additional geometric and structural information about the scene and the image may 
be included in the test, if it is available. Based on the results of those tests, some new 
sample locations may be generated in that region and then evaluated. Typically the 
tests are then repeated on the new samples, and the process repeats until the tests are 
satisfied or some upper limit on the number of repetitions has been reached. 
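The loop described above can be sketched in a few lines. The Python example below works on a 1D domain and makes several choices of our own for illustration: regions are split in half, each region receives four jittered samples, and the test is a simple max-minus-min comparison. None of these details are prescribed by the text.

```python
import random


def refine(signal, lo, hi, accept, max_depth=6, per_region=4, rng=None):
    """Adaptively sample `signal` on [lo, hi); return a list of (x, value)."""
    rng = rng or random.Random(0)
    width = (hi - lo) / per_region
    samples = []
    for i in range(per_region):
        x = lo + (i + rng.random()) * width    # one jittered sample per cell
        samples.append((x, signal(x)))
    if max_depth == 0 or accept([v for _, v in samples]):
        return samples                          # test satisfied or limit hit
    mid = 0.5 * (lo + hi)
    return (samples
            + refine(signal, lo, mid, accept, max_depth - 1, per_region, rng)
            + refine(signal, mid, hi, accept, max_depth - 1, per_region, rng))


def step(x):
    """A signal with a single edge at x = 0.3."""
    return 0.0 if x < 0.3 else 1.0


def similar(values, epsilon=0.01):
    """Acceptance test: succeed when the values are nearly equal."""
    return max(values) - min(values) <= epsilon


samples = refine(step, 0.0, 1.0, similar)
left = [x for x, _ in samples if x < 0.5]
right = [x for x, _ in samples if x >= 0.5]
print(len(left), "samples left of 0.5,", len(right), "to the right")
```

The half of the domain containing the edge receives most of the samples, while the uniform half stops after its first test succeeds.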

The general idea is usually based on the expectation that the signal will eventually 
be reconstructed from the sampled data. When the data is uniform, or homogeneous, 
in a given region, then we assume that we have completely characterized the signal 
in that region. For example, a simple reconstruction will use the mean value in a 
region as the value for all points in the region. Thus, the approach will be to make 
sure that all the samples in a given region are “similar” in some specified way, so 
we can feel that we have captured what is happening in a region of the signal. If 
a region is nonuniform, or heterogeneous, then we will typically want to refine our 
regions until each subregion is uniform. 

In this section we will survey several refinement methods that have been proposed
in the literature. In general, each method first performs a sample selection, which
identifies which samples participate in the test. Then a refinement test is applied to
those samples, which normally involves several criteria.

Typically when the test is satisfied, that success indicates that the samples represent
a good estimate of the signal in that neighborhood. We sometimes see two slightly
different approaches in the literature, the pessimistic and the optimistic. The names
are assigned based on the pessimist’s idea that more samples should always be taken
unless one is explicitly told to stop; the optimist assumes that the samples gathered so
far are enough and only takes more if necessary. The pessimistic approach assumes
that the initial estimate is incomplete. This means we should take ever more samples
until the test is satisfied. From this point of view, we say the test implements stopping
criteria, since the default is to take more samples. The optimistic approach assumes
that the samples being tested are a good estimate, and more samples are taken only
if the test fails; in this case the test implements refinement criteria. Both cases boil
down to the same thing, but the discussion is slightly different depending on our
point of view. For consistency, in this book we will adopt the optimistic approach.
We will present all the tests so that they report acceptance: if the test fails, it indicates
that more sampling is necessary.

Many of the tests are described in the literature with respect to “pixels.” Typically
these authors are implicitly using the fact that when an image is resampled onto a 
pixel grid, the values near the center of pixels usually get weighted more than those 
farther away from the center. In other words, they know that the low-pass filter step 
(carried out by convolution) and resampling are going to be rolled together; so they 
can implicitly carry out importance sampling and consider the centers of pixels to be 
more important than the edges. If this approach is taken, we must be extra careful 
about reconstruction. We will return to this in the reconstruction section below. In 
general, if we know the resampling locations at the time of the original sampling, 
this information can be used to our benefit. Since adaptive refinement of sampled 
signals is appropriate for applications other than estimating image intensity at the 
image plane, we will refer to resample locations rather than pixels. 

Some of the tests select the samples to be used based on how they are expected to 
be arranged. For example, it is not uncommon to examine the four corners of a pixel 
or the centers of neighboring pixels. Such tests only make sense in their original form 
if the samples fall in the expected locations. Most of these special cases are based on 
image sampling where square pixels are expected; they are gathered together below 
in one section. 

When the refinement test fails and more samples are needed, we must decide 
where they should go. Some algorithms generate new samples on the fly, usually on 
the basis of some subdivision scheme among the samples used in the test. Others use 
more samples from a precomputed pattern such as those presented earlier. We will 
show the different choices used by different algorithms. 

It is also worth noting that when one region needs refinement, an adjacent region 
will often also need more samples. If the regions overlap, then some of the new 
samples created for the first region may be useful in the second, if the test and 
new-sample geometries coincide. 


10.6.1 Sample Intensity

Many tests are defined in the literature in terms of the intensities of sample points. 
This is appropriate when the signal is any multidimensional vector quantity, though 
the term evokes a gray-scale image. Because the idea of intensity comparison is so 
useful in describing refinement tests, we pause for a moment to interpret this term 
for different situations. 

We assume for the discussion of refinement tests that the value of each sample
s is an n-dimensional vector: $s \in \mathbb{R}^n$. This may stand for any abstract quantity.
It is particularly useful when the sample represents a color in some color system.
Different color systems use different values of n. If a system evaluates the image
color as a full spectrum, n might be 30 or more, one for each wavelength. If the
system represents colors with RGB or XYZ descriptions, n would be 3. For a
black-and-white gray-scale image n would be 1. If the system evaluates each spectral
wavelength independently, n would be 1, but it would have an associated wavelength
attached to it.

Comparison among these different color representations is traditionally handled
in a rather cavalier fashion. Two RGB colors A and B are frequently compared
using either the $\mathcal{L}^\infty$ norm or the $\mathcal{L}^2$ norm:

$$\mathcal{L}^\infty(A, B) = \max(|A_r - B_r|, |A_g - B_g|, |A_b - B_b|)$$

$$\mathcal{L}^2(A, B) = \sqrt{(A_r - B_r)^2 + (A_g - B_g)^2 + (A_b - B_b)^2} \tag{10.17}$$

although a comparison in L*a*b* or L*u*v* color space is probably more appropriate.
Alternatively, the color components may be compared against different thresholds,
or the difference may be found in some other color space. The correct interpretation
of terms like “similar” and “different” when applied to two or more sample
values depends on the context, which provides an interpretation of sample values.

From here on, when we mention “intensity,” and discuss whether two or more 
“intensities” are “similar” or “different,” we mean these terms to stand for any of 
these interpretations, depending on what is appropriate in context for that signal. 
The thresholds for similarity are also dependent on this context. 
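For sample values stored as tuples, the two norms in Equation 10.17 take only a few lines each. The following Python sketch uses function names of our own choosing and works for any number of components, not just RGB.

```python
import math


def linf_distance(a, b):
    """Largest per-component absolute difference between two samples."""
    return max(abs(x - y) for x, y in zip(a, b))


def l2_distance(a, b):
    """Euclidean distance between two samples."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Two colors given as 8-bit RGB triples.
color_a = (10, 20, 30)
color_b = (13, 24, 30)
print(linf_distance(color_a, color_b), l2_distance(color_a, color_b))
```

Either function can serve as the "intensity difference" in the tests that follow; which one is appropriate, and what threshold to compare against, depends on the signal being sampled.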


10.7 Refinement Tests 

We divide refinement tests into five general categories, distinguished by the type of
information used in the test. The five types of tests are intensity comparison,
contrast, object-based, ray-tree comparison, and intensity statistics.
These different types of tests are discussed in order below.

Each type of test typically involves several samples. We will refer to the collection
of samples used in any test as the study set S of samples for that test, containing
n samples $\{s_0, s_1, \ldots, s_{n-1}\}$ with corresponding values $\{v_0, v_1, \ldots, v_{n-1}\}$. In any
set S, we define the minimum and maximum values to be $S_{\min}$ and $S_{\max}$. Notice
that more than one sample can have a value corresponding to the minimum and
maximum. We also define the mean of all the samples $\bar{S}$ as

$$\bar{S} = \frac{1}{n} \sum_{i=0}^{n-1} v_i \tag{10.18}$$

10.7.1 Intensity Comparison Refinement Test

The simplest form of adaptive refinement test compares the intensity of the samples
in the study set.

if S_max - S_min > ε
then fail
else succeed
endif


FIGURE 10.53

Adaptive sampling from Whitted [477].


Intensity Difference

In the first paper on adaptive point-sampling, Whitted stated that when considering 
four samples that form a small square in the sampling domain, “If the intensities 
calculated at the four points have nearly equal values and no small object lies in the 
region between them, the algorithm assumes that the average of the four values is a 
good approximation of the intensity over the entire region” [477]. We will return 
to the “small object” idea below. Pseudocode for the intensity clause of this simple 
algorithm is shown in Figure 10.53. 

The value of ε in Figure 10.53 is user-defined. Whitted offers no advice on its
selection, and indeed in different contexts different values will be most appropriate.
When this method is used at the image plane for sampling an image function, a
common rule of thumb is that if you are going to display on a frame buffer with
eight bits of RGB color specification, then $\epsilon \approx 1/2^8 = 1/256$ is in the right ballpark.
Note that the number of bits used here is the depth of the color value, not the color
identifier. So if a frame buffer is eight bits deep but each entry points to a colormap
entry with twelve bits per color component, we would choose $\epsilon \approx 1/2^{12} = 1/4096$.
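As a minimal sketch, the intensity clause and the rule of thumb for choosing the threshold might look as follows in Python. We assume scalar intensities normalized to the range 0 to 1; the function names are ours, and the small-object clause of Whitted's test is not included.

```python
def epsilon_for_bit_depth(bits):
    """Rule-of-thumb threshold: one quantization step of the color value."""
    return 1.0 / (1 << bits)


def intensity_difference_test(values, epsilon):
    """Return True (succeed) when S_max - S_min <= epsilon, else False (fail)."""
    return max(values) - min(values) <= epsilon


corners = [0.500, 0.501, 0.502, 0.500]   # four nearly equal corner samples

# With 8-bit color the corners are indistinguishable on the display ...
print(intensity_difference_test(corners, epsilon_for_bit_depth(8)))
# ... but with 12 bits per component the same samples call for refinement.
print(intensity_difference_test(corners, epsilon_for_bit_depth(12)))
```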


Intensity Groups

A similar test is used in Jansen and van Wijk [228]. In this test, the min and max
of a set of values are compared against a reference value t; again, if the difference is
too large, the test fails and the sample set should be refined. This method is shown
in Figure 10.54.

if (|S_max - t| > ε) or (|S_min - t| > ε)
then fail
else succeed
endif


FIGURE 10.54

Jansen and van Wijk’s test.


Jansen and van Wijk present this method in the context of a recursive refinement
algorithm that subdivides a region of a domain over and over. They state that
although the value of ε may be held constant, computation time may be saved if ε
is increased as the recursion level increases. This makes the first steps of the recursion
more important than later steps. When there are many regions to be refined at once,
this will help enforce a breadth-first refinement of the domain, where most regions
are subdivided a bit before some regions become highly refined. They reported good
results using the sequence of (0.0, 0.05, 0.15) for ε to generate smooth-shaded ray-cast
images.
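The level-dependent threshold can be sketched as a small lookup. In the Python example below, the schedule is the sequence reported above; reusing the last value for deeper recursion levels is our assumption, as are the function names.

```python
EPSILON_BY_LEVEL = (0.0, 0.05, 0.15)   # sequence reported in the text


def epsilon_at(level, schedule=EPSILON_BY_LEVEL):
    """Threshold for a recursion level; deeper levels reuse the last entry."""
    return schedule[min(level, len(schedule) - 1)]


def intensity_groups_test(values, t, epsilon):
    """Succeed (True) when both extremes lie within epsilon of reference t."""
    return (abs(max(values) - t) <= epsilon
            and abs(min(values) - t) <= epsilon)


values = [0.42, 0.48]
reference = 0.45
for level in range(3):
    print(level, intensity_groups_test(values, reference, epsilon_at(level)))
```

At level 0 the threshold of 0.0 forces a subdivision unless the values match the reference exactly, while the looser thresholds at later levels let the same values pass.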


10.7.2 Contrast Refinement Test


Mitchell has pointed out [307] that contrast is a good predictor of the response of
the eye to variations in light intensity. One definition of contrast for a study set S is

$$C = \frac{S_{\max} - S_{\min}}{S_{\max} + S_{\min}} \tag{10.19}$$

(other definitions are given in Section 3.3.3). This is a good heuristic to use for
evaluating image functions intended for viewing by human observers. When contrast
is used in a system that samples in red, green, and blue, Mitchell observed that a
different contrast threshold may be used for each of these color bands. This allows the
system to give more weight to the green component of the signal, to which the eye is
most sensitive, less to the red, and still less to the blue. He reported good results with red,
green, and blue contrasts set to (0.4, 0.3, 0.6), respectively.

This type of ratio test is not appropriate when pixel values are all zero or very
small. The all-zero condition is easily tested and indeed must be tested for, or one
will divide by 0. When pixel values are small in magnitude, the test can be overly
sensitive. If $S_{\max} = 0.3$ and $S_{\min} = 0.1$, then C = 1/2, which is the same result as for
$S_{\max} = 0.03$ and $S_{\min} = 0.01$. We probably want to trigger sampling in the former case
but not the latter. One way to distinguish these is to multiply the contrast by the
mean, using $C' = C\bar{S}$ rather than C. With means of 0.2 and 0.02, this would give
$C' = 0.1$ in the former case and $C' = 0.01$ in the latter case.
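Evaluating Equation 10.19 directly shows the effect of scaling by the mean. The Python sketch below assumes nonnegative sample values; the zero guard, the per-channel test, and all names are our own illustration, with the per-channel thresholds taken from the values reported above.

```python
def contrast(values):
    """Contrast of Eq. 10.19; returns 0 for an all-zero study set."""
    s_max, s_min = max(values), min(values)
    if s_max + s_min == 0.0:
        return 0.0                      # avoid dividing by zero
    return (s_max - s_min) / (s_max + s_min)


def mean_scaled_contrast(values):
    """Contrast multiplied by the mean of the study set."""
    return contrast(values) * sum(values) / len(values)


RGB_THRESHOLDS = (0.4, 0.3, 0.6)        # red, green, blue


def contrast_test_rgb(samples, thresholds=RGB_THRESHOLDS):
    """Succeed (True) when every channel's contrast is within its threshold."""
    return all(contrast([s[ch] for s in samples]) <= thresholds[ch]
               for ch in range(3))


bright = [0.1, 0.3]
dim = [0.01, 0.03]
print(contrast(bright), contrast(dim))                        # equal
print(mean_scaled_contrast(bright), mean_scaled_contrast(dim))  # differ by 10x

# The same step in green and in blue: only the green one triggers refinement.
gray = (0.2, 0.2, 0.2)
print(contrast_test_rgb([gray, (0.2, 0.5, 0.2)]))
print(contrast_test_rgb([gray, (0.2, 0.2, 0.5)]))
```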

When we are sampling the image plane, contrast then seems like a reasonable
metric to control refinement, though it is of less value for other functions such as
illumination signals.


10.7.3 Object-Based Refinement Test

Rather than examine just the intensity values of samples, we can use additional information
carried by the samples that is unique to a computer graphics environment.
In particular, for image and illumination signals, every sample may be characterized
as a ray that strikes some particular object (we assume the scene is enclosed in a
bounding “background” object that can be uniquely identified; a ray that passes out
of the scene may be assumed to strike this object). This conceptual characterization
holds true whether or not the samples are in fact evaluated by ray-tracing methods.
We will call the object number associated with a given sample the “object tag” for
that sample.


Object-Difference Test

Roth suggested locating edges adaptively by comparing the object tags for adjacent
samples [361]. When the tags differ, new samples are created and evaluated between
the two samples, and the process iterates until the distance between the two differing
samples is below some threshold. This simple algorithm is outlined in Figure 10.55.

if obj_1 ≠ obj_2
then fail
else succeed
endif


FIGURE 10.55

The Roth test.


The primary advantage of this approach is that it does not require any user-specified
thresholds (except a recursion limit). The primary disadvantage is that it is
only sensitive to changes in object tag. If a single object has varying characteristics
(due to texture, high-curvature geometry, or surface finish), then this approach will
not capture the effect of those variations on the signal.
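The iteration between two differing samples amounts to a bisection. The Python sketch below works along a single scan line and uses a stopping distance in place of a recursion limit; both simplifications, and the names, are ours and not taken from Roth's paper.

```python
def locate_edge(object_tag_at, x0, x1, min_separation=1e-3):
    """Bisect between two samples whose object tags differ.

    Returns an estimate of the edge position, or None when the tags
    match (the test succeeds and no refinement is needed).
    """
    tag0 = object_tag_at(x0)
    if tag0 == object_tag_at(x1):
        return None
    while x1 - x0 > min_separation:
        mid = 0.5 * (x0 + x1)
        if object_tag_at(mid) == tag0:
            x0 = mid        # edge lies to the right of the new sample
        else:
            x1 = mid        # edge lies to the left of (or at) the new sample
    return 0.5 * (x0 + x1)


def tag(x):
    """A scan line that sees object "A" up to x = 0.37 and "B" beyond."""
    return "A" if x < 0.37 else "B"


print(locate_edge(tag, 0.0, 1.0))   # close to 0.37
print(locate_edge(tag, 0.0, 0.2))   # None: both samples hit object "A"
```

Note that no intensity threshold appears anywhere; only the stopping distance has to be chosen.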


Four-Level Test

This idea was pushed a step farther by Argence [12], who noted that sometimes a
single object is broken up into many smaller patches during the modeling or rendering
phases. Many systems distinguish an “object number,” which is consistent for an
entire continuous surface, from a “patch number,” which starts at one for each
surface and enumerates the derived patches. Thus, an object/patch pair acts like a
major and minor identifier. Argence requires that patch numbers also be the same
for the refinement test to pass.

Two more criteria are included in this test. The first is a check that the object
pointed to by the object tag and patch number is illuminated by the same light
sources. If the point visible from one sample is not illuminated by the same set of
lights as a neighboring sample, then there is a change in illumination and shadow
between those samples, and Argence finds this phenomenon worthy of refinement.

There is also a test for small objects. This type of test was first suggested by
Whitted [477], who enclosed each object in a sphere with a radius chosen to guarantee
that the sphere covered at least one screen pixel. If an eye ray intersected that
bounding sphere but did not hit the object, then the four subsquares sharing that
ray as a common vertex were subdivided to “look” for the object. Roth has pointed
out [361] that this approach has a problem for long, skinny objects, as illustrated in
Figure 10.56. A ray may hit the bounding sphere, but none of the squares around it
intersect with the projection of the object. Thus the refinement algorithm will never
find the object, no matter how far it refines.


FIGURE 10.56

A problem with minimal bounding spheres for hit detection.


Argence includes a similar test, but only builds and uses the minimal sphere if the
object’s projection is smaller than one intersample distance (e.g., a pixel width). This
removes the problem demonstrated by Figure 10.56, but it also means that the long
shape shown in that figure could be completely missed by the algorithm. Pseudocode
for the Argence method is shown in Figure 10.57.

if (obj_1 ≠ obj_2)
   or (patch_1 ≠ patch_2)
   or (lighting_1 ≠ lighting_2)
   or (small-object-detected)
then fail
else succeed
endif


FIGURE 10.57

The Argence test.
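The four clauses of this test translate directly into a comparison of tagged samples. In the Python sketch below, the `TaggedSample` record and its field names are hypothetical, introduced only for illustration; how the small-object flag is computed (for example, from a bounding-sphere hit) is left outside the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedSample:
    """Hypothetical per-sample record; field names are ours."""
    obj: int                      # object tag
    patch: int                    # patch number within the object
    lights: frozenset             # ids of the lights illuminating the hit point
    small_object_detected: bool = False


def four_level_test(s1, s2):
    """Return True (succeed) only when none of the four clauses fires."""
    if s1.obj != s2.obj:
        return False
    if s1.patch != s2.patch:
        return False
    if s1.lights != s2.lights:
        return False
    if s1.small_object_detected or s2.small_object_detected:
        return False
    return True


a = TaggedSample(obj=3, patch=1, lights=frozenset({0, 1}))
b = TaggedSample(obj=3, patch=1, lights=frozenset({0}))   # one light blocked
print(four_level_test(a, a), four_level_test(a, b))
```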

The “small object test” was extended and made more robust by Thomas et al. 
[436]. They suggested that refinement should occur near edges in the function 
being sampled. When the function is the image, important classes of edges are the 
boundaries of objects, as mentioned above for Roth’s method. Thomas et al. place 
covers around each object to help detect edges. A cover is a unique pair of surfaces 
associated with each object. One surface encloses the object, the other is enclosed 
by the object. The thickness of the covers is chosen to guarantee that at least one 
sample point will pierce each cover. Using covers projected to the image plane, you 
can determine if there is an object boundary between any given pair of pixels. Then 
an algorithm similar to Figure 10.55 may be used to trigger refinement in those 
regions. Like Roth’s method, this technique is good for finding boundaries but will 
not detect other phenomena that can cause high-frequency information in the signal. 


Object-Count Test

A similar approach for image functions was described by Hashimoto et al. [196].
Using projections of silhouette edges onto the image plane, they are able to identify
the number of different objects in a given region. If there are more than two objects
in the region, refinement is triggered, as in Figure 10.58.

if #objects > 2
then fail
else succeed
endif


FIGURE 10.58

Hashimoto’s test.


Mean-Distance Test

To catch high-frequency information caused by texture, van Walsum et al. proposed
texture-space measures associated with each set of samples [450]. They note that
we typically know a lot of information about textures, particularly those stored in
tables or images, and that this information may be used to improve our sampling
quality. For example, Figure 10.59 shows samples of a black-and-white texture that
are all white. There is no way to know, simply from looking at the sample values,
that the texture is in fact not entirely white.


FIGURE 10.59

All the samples have the same value.


We are lucky when dealing with textures because we can gather a complete picture
of what is happening between these texture samples. Van Walsum et al. proposed three
criteria to use for refinement. The first is uniformity: the texture values between two samples
are examined for variation; more than a certain amount is grounds for refinement.
The distance criterion measures whether the texture locations represented by two
samples are farther away than some threshold. The third measure, called the filter
criterion, estimates whether a prefiltered local average of the texture is close to the
texture value attached to that sample. These three tests may be used together or
independently, with the appropriate values for each threshold. Van Walsum et al.
report their results for a variety of thresholds for some test images; the uniformity
criterion seemed to be the best performer for the high-frequency texture they studied.
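To make the three criteria concrete, the Python sketch below applies them to a 1D texture stored as a list of texel values. This is a loose illustration under our own assumptions: texture locations are integer texel indices, and a box average computed on the fly stands in for the prefiltered local average. The function names and thresholds are not taken from van Walsum et al.

```python
def uniformity_ok(texture, u0, u1, max_variation):
    """Succeed when the texels between two samples vary little."""
    lo, hi = sorted((u0, u1))
    span = texture[lo:hi + 1]
    return max(span) - min(span) <= max_variation


def distance_ok(u0, u1, max_distance):
    """Succeed when the two texture locations are close enough together."""
    return abs(u1 - u0) <= max_distance


def filter_ok(texture, u, radius, max_difference):
    """Succeed when a local box average is close to the texel at the sample."""
    lo = max(0, u - radius)
    hi = min(len(texture), u + radius + 1)
    local_mean = sum(texture[lo:hi]) / (hi - lo)
    return abs(local_mean - texture[u]) <= max_difference


# A striped black-and-white texture; samples at texels 0 and 9 are both white.
stripes = [1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
print(stripes[0] == stripes[9])                 # the sample values agree ...
print(uniformity_ok(stripes, 0, 9, 0.1))        # ... but the texture between does not
print(distance_ok(0, 9, 4))
print(filter_ok(stripes, 0, 2, 0.1))
```

This mirrors the situation of Figure 10.59: the sample values alone suggest a white texture, while any of the three texture-space checks reveals the stripes.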


Cook's Test

Cook switches to a higher level of refinement in a region if any object moves more
than eight pixels horizontally or vertically throughout the duration of that frame
[101].


10.7.4 Ray-Tree Comparison Refinement Test

In a ray-tracing environment, each sample may be represented by a ray tree, which
gives a complete object-intersection history encountered by a screen sample representing
a ray of light (ray tracing is discussed in detail in Chapter 19). The trees of
neighboring samples can be compared for similar structure or content.

The method of Argence described above [12] can use this mechanism to compare 
lighting. Recall that subdivision is called for if two samples fall on the same patch 
of the same surface, but the illumination on the intersected points is different. One 
way to check this latter condition is by examining the ray tree associated with each 
sample. 

The refinement test used by Akimoto et al. [6] tests the entire ray tree against 
a number of criteria. Like Jansen and van Wijk, Akimoto et al. know where the 
resamples will be located in the domain. The goal is to determine not if more samples 
need to be taken, but rather which of the already placed samples should be evaluated. 
They distinguish four levels of evaluation for each sample, based on its neighbors. 
The classifications are shown in Figure 10.60. 

In level 1, the ray trees associated with the selected neighbors are structurally 
different. Thus, we cannot say anything about the similarity of the samples or 
assume anything about the new samples; they must be fully evaluated. In level 2, the 
trees are structurally similar, but the lighting information is different. This is similar 
to the test used by Argence, but it includes lighting information through the entire 
ray tree, not just the top level. In this case the new sample inherits the common ray 
tree of its neighbors, but the shadow information is recomputed. In the third case, 
the trees are equivalent and the shadow information is the same, but some of the 

| Level | Test | Action |
|-------|------|--------|
| 1 | Trees unequal. | Ray-trace the sample. |
| 2 | Trees equal but shadows differ. | Ray-trace only the necessary samples. |
| 3 | Trees equal and shadows equal but textures present or excessive intensity difference. | Recompute shading information for samples. |
| 4 | Trees and shadows equal, no textures present, and intensities sufficiently similar. | Interpolate sample value. |

FIGURE 10.60
The four refinement levels for ray-tree refinement.

if (max(|v_i - S̄|) > ε)
    then fail
    else succeed

FIGURE 10.61
Adaptive sampling from Akimoto et al. [6].


surfaces have a texture upon them, or the values of the samples exceed a threshold. 
Akimoto et al. compare the largest distance of each intensity from the mean, as 
in Figure 10.61. In this case the new sample can again inherit the ray tree of its 
neighbors, but the shading at each node needs to be recomputed. The advantage 
here is that the visibility problem does not need to be solved again. The fourth case 
identifies when the neighbors are sufficiently similar that the new sample may be 
estimated from its neighbors by interpolation. 
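
The four levels can be read as a cascade of progressively cheaper fallbacks. The following is a minimal sketch of that cascade, not the implementation of Akimoto et al.; the `NeighborInfo` record and its field names are hypothetical stand-ins for whatever a renderer would extract from the neighboring ray trees, and `eps` plays the role of the threshold in Figure 10.61.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class NeighborInfo:
    """Hypothetical summary of the neighbors surrounding a new sample."""
    trees_equal: bool             # ray trees share the same structure
    shadows_equal: bool           # shadow information agrees through the trees
    textured: bool                # some intersected surface carries a texture
    intensities: Sequence[float]  # intensity of each neighboring sample


def max_deviation_exceeds(values, eps):
    """Test of Figure 10.61: is any value farther than eps from the mean?"""
    mean = sum(values) / len(values)
    return max(abs(v - mean) for v in values) > eps


def refinement_level(info, eps):
    """Return 1..4 following the classification of Figure 10.60."""
    if not info.trees_equal:
        return 1  # ray-trace the new sample from scratch
    if not info.shadows_equal:
        return 2  # inherit the tree, recompute shadow information
    if info.textured or max_deviation_exceeds(info.intensities, eps):
        return 3  # inherit tree and visibility, recompute shading
    return 4      # interpolate from the neighbors
```

A level of 4 is the only case in which no rays or shading calculations are needed for the new sample.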

The intensity comparison tests described in this section are summarized in
Table 10.6.


10.7.5 Intensity Statistics Refinement Test

Several researchers have investigated refinement tests based on statistical measures of 
the values of samples in a neighborhood. These are typically based on the “intensity” 
values of the samples. 
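
| Name of test | Test | Parameters | Reference |
|--------------|------|------------|-----------|
| Intensity difference | $S_{\max} - S_{\min} > \Delta I$ | $\Delta I$ | [477] |
| Intensity groups | $(\lvert S_{\max} - t \rvert > \Delta I)$ or $(\lvert S_{\min} - t \rvert > \Delta I)$ | $\Delta I$ | [228] |
| Object difference | $\mathrm{obj}_1 \neq \mathrm{obj}_2$ | - | [361] |
| Four-level | $(\mathrm{obj}_1 \neq \mathrm{obj}_2)$ or $(\mathrm{patch}_1 \neq \mathrm{patch}_2)$ or $(\mathrm{lighting}_1 \neq \mathrm{lighting}_2)$ or (small-object-detected) | - | [12] |
| Object count | #objects > n | n = number of objects | [196] |
| | $\max(\lvert v_i - \bar{S} \rvert) > \Delta I$ | $\Delta I$ | [6] |

TABLE 10.6
Intensity difference tests.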


SNR Test

Dippe and Wold [124] proposed computing the signal-to-noise ratio (SNR) of a set 
of samples. One conceptual model of the SNR of a signal is that it measures the 
degradation of a perfect signal over an imperfect communications line. The line 
introduces noise into the otherwise accurate signal; the SNR measures the extent of 
this degradation. They note that the quality of the SNR estimate can depend on 
the number of samples used in its computation; if there are only a few samples, the 
confidence in the estimated SNR is low. 

Dippe and Wold observed that root-mean-square signal-to-noise ratio (RMS 
SNR) is equal to the square root of the sampling rate times a constant, which is 
based on the spectrum of the signal and the filter used to reconstruct the signal. It 
is usually the case in graphics that we don’t know the spectrum of the signal we’re 
sampling, but we usually need to assume it can contain very high frequencies. 
Variance Test


Lee et al. [261] suggested using the variance of a study set to estimate its accuracy. 
The basic idea is that as the variance diminishes, the samples are more consistent, 
and it is increasingly likely that the samples are a good estimator of the signal. 

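The variance after N samples is estimated by $\sigma_N^2$, which is found from

$$\sigma_N^2 = \frac{1}{N} \sum_{i=1}^{N} (S_i - \bar{S})^2 \qquad (10.20)$$

where $\bar{S}$ is the mean of the samples:

$$\bar{S} = \frac{1}{N} \sum_{i=1}^{N} S_i \qquad (10.21)$$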


To test the quality of a sample set, we ask if the variance $\sigma_N^2$ of the set is below some
threshold T. Now recall that $\sigma_N^2$ is just an estimate of the true variance of the signal,
so it is necessarily imprecise. So we introduce a slightly less precise test, and check
to see if the probability that the variance is less than T is within some probability
tolerance $\beta$.

The test is set up by defining a real number $\chi^2_\beta(N-1) \in \mathcal{R}$ so that

$$\mathrm{prob}\left[ \frac{N \sigma_N^2}{\mathrm{var}(S)} < \chi^2_\beta(N-1) \right] = \beta \qquad (10.22)$$


In words, the test succeeds if the estimated variance is probably less than the constant,
where “probably” means the test has a chance of $\beta$ of being right.

To implement this we need a value for $\chi^2_\beta(N-1)$. Normally this value is obtained by
looking it up based on $\beta$ and (N - 1) in a table of statistical values, though chi-square
values are often easily available using symbolic mathematics programs. The choice
of T and $\beta$ can be made based on the maximum number of samples we are willing
to take. Suppose M is the highest variance we expect in the scene, and we want to
take a maximum of Z samples. Then we want the variance test to succeed after Z
samples when the maximum variance has just been reached; this happens when

$$T = \frac{M}{\chi^2_\beta(Z-1)} \qquad (10.23)$$


Lee et al. suggest precomputing a table of values for $T \chi^2_\beta(N-1)$ for N = 1, 2, ..., Z.
They set the maximum variance to M = 1/128 based on their frame buffer’s color
resolution, and Z = 96 based on desired run time. Then $\beta = 0.05$ and T = 0.000105.

To implement the test after N samples, first compute $\sigma_N^2$ using Equation 10.20,
and then evaluate

$$G = \frac{\sigma_N^2}{\chi^2_\beta(N-1)} \qquad (10.24)$$
If G < T, where T is the variance threshold, then stop sampling. Otherwise draw
another sample and repeat the test. The probability of stopping too early is

$$\mathrm{prob}\left[ \frac{\mathrm{var}(S)}{N} > T \right] \qquad (10.25)$$

which by construction is less than $\beta$.
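
The stopping rule can be sketched in a few lines. This is an illustration under stated assumptions rather than the code of Lee et al.: the chi-square quantile is approximated with the Wilson-Hilferty formula (a substitution made here so that only the standard library is needed; a statistical table or library routine would normally be used), and the function names are hypothetical.

```python
from statistics import NormalDist


def chi2_quantile(p, dof):
    """Approximate chi-square quantile (Wilson-Hilferty; an assumption here)."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * dof)
    return dof * (1.0 - c + z * c ** 0.5) ** 3


def variance_threshold(max_variance, max_samples, beta):
    """Equation 10.23: T = M / chi2_beta(Z - 1)."""
    return max_variance / chi2_quantile(beta, max_samples - 1)


def sample_variance(samples):
    """Equation 10.20, using the 1/N normalization."""
    n = len(samples)
    mean = sum(samples) / n
    return sum((s - mean) ** 2 for s in samples) / n


def should_stop(samples, threshold, beta):
    """Stop when G = sigma_N^2 / chi2_beta(N - 1) falls below T."""
    n = len(samples)
    if n < 2:
        return False  # no degrees of freedom yet
    g = sample_variance(samples) / chi2_quantile(beta, n - 1)
    return g < threshold
```

With M = 1/128, Z = 96, and a tolerance of 0.05, `variance_threshold` returns a value close to the T = 0.000105 quoted above; the small difference comes from the approximation.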

Confidence Test 

Purgathofer presents a test based on a confidence interval [351]. To test for refinement,
we estimate the likelihood that the mean of the current samples is within a
certain tolerance of the accurate mean of the signal in that neighborhood. The user
supplies the tolerance w and probability $\alpha$; for example, 99% certainty corresponds
to $\alpha = 0.01$. The probability P that the current mean of n samples is within w on
either side of the accurate mean is given by the t test with (n - 1) degrees of freedom.
Refinement is triggered if the desired probability P is less than the estimated
probability p:

$$P < p \qquad (10.26)$$

or equivalently,

$$t_{1-\alpha,\,n-1} \frac{\sigma}{\sqrt{n}} < w \qquad (10.27)$$

where $\sigma$ is the standard deviation for this group of samples, from Equation 10.20.

Purgathofer reinforces Dippe and Wold by stressing that when there are only a 
few samples (i.e., n is small), then the test of Equation 10.27 will give inaccurate 
results. We must always begin with “enough” samples to make the test meaningful 
before it can be used. To determine how many samples are enough, Purgathofer 
gives an elegant argument for the worst case: an asymmetric bimodal distribution. 

Suppose that we are sampling an image in a square pixel, and a vertical edge
divides that image into two unequal pieces, the left one smaller and filled black, the
right one larger and white, as in Figure 10.62. The black signal has value 0, the
white signal value 1. We assume that the box has area 1, and the left piece has area d;
the right then has area 1 - d.

Suppose we have terrible luck and every sample we evaluate lands on the white
region; our reported value will then be 1, rather than 1 - d. The test will be valid as
long as we can get at least one value from both domains, so we want to know the
probability that we will hit one of each domain after n samples. We generate n
independent random samples $u_i$. The probability of each one landing in the larger
zone is (1 - d), and because they are independent, the probability that they will all
land in that zone is

$$(1 - d)^n \qquad (10.28)$$
FIGURE 10.62

A pixel split into a black left and white right. 


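| d \ $\alpha$ | 0.025 | 0.050 | 0.075 | 0.100 | 0.125 | 0.150 | 0.175 | 0.200 |
|-------|-----|-----|-----|-----|-----|-----|-----|-----|
| 0.004 | 402 | 435 | 474 | 519 | 575 | 647 | 748 | 921 |
| 0.025 | 64 | 69 | 75 | 83 | 91 | 103 | 119 | 146 |
| 0.050 | 32 | 34 | 37 | 41 | 45 | 51 | 59 | 72 |
| 0.075 | 21 | 23 | 25 | 27 | 30 | 34 | 39 | 48 |
| 0.100 | 16 | 17 | 19 | 20 | 22 | 25 | 29 | 36 |
| 0.125 | 13 | 14 | 15 | 16 | 18 | 20 | 23 | 28 |
| 0.150 | 10 | 11 | 12 | 13 | 15 | 16 | 19 | 23 |
| 0.175 | 9 | 10 | 10 | 11 | 12 | 14 | 16 | 20 |
| 0.200 | 8 | 8 | 9 | 10 | 11 | 12 | 14 | 17 |

TABLE 10.7
Values of n given d and $\alpha$.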


We want this probability to be less than our desired $(1 - \alpha)$, so $(1 - d)^n < 1 - \alpha$, or

$$n > \frac{\log(1 - \alpha)}{\log(1 - d)} \qquad (10.29)$$


The value of n for values of $0.8 < \alpha < 0.99$ and $0.01 < d < 0.2$ is shown in
Figure 10.63. Some values of n are tabulated in Table 10.7.

Notice how quickly n grows as the confidence increases and the interval decreases.
Figure 10.64 shows the value of n for different choices of $\alpha$ when $d = 1/256 \approx 0.004$.
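
Equation 10.29 is easy to evaluate directly. The sketch below assumes only the formula itself; the function names are illustrative, and the loop after the ceiling is a guard added here so that the strict inequality $(1 - d)^n < 1 - \alpha$ holds even when the ratio of logarithms is an exact integer.

```python
import math


def min_samples(alpha, d):
    """Smallest integer n with (1 - d)**n < 1 - alpha (Equation 10.29)."""
    if not (0.0 < alpha < 1.0 and 0.0 < d < 1.0):
        raise ValueError("alpha and d must lie strictly between 0 and 1")
    n = math.ceil(math.log(1.0 - alpha) / math.log(1.0 - d))
    while (1.0 - d) ** n >= 1.0 - alpha:
        n += 1  # enforce the strict inequality
    return n


def certainty(n, d):
    """Probability that n independent samples do not all land in the
    region of area (1 - d), from Equation 10.28."""
    return 1.0 - (1.0 - d) ** n
```

For instance, `min_samples(0.8, 0.025)` evaluates the bound for a region of area 0.025 and a confidence of 0.8, and `certainty(8, 1 / 256)` gives the confidence available from eight samples when d = 1/256.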




















FIGURE 10.63
The value of n versus $\alpha$ and d from Equation 10.29.

FIGURE 10.64
Values for n when $d = 1/255 \approx 0.004$. This is a log-linear scale; the horizontal axis is linear in $\alpha$,
the vertical is logarithmic in n. The lower-left value is for $\alpha = 0.01$, n = 3; the upper-right value
is $\alpha = 0.99$, n = 149.
Notice that if we take eight samples in a pixel, we can only estimate that the mean
of these samples will be within 1/256 of the true mean with about 3% certainty.

Purgathofer remarks that this worst-case analysis is necessarily pessimistic, and 
that we can almost always get away with fewer samples. Stratified sampling, for 
example, is particularly effective for making sure that much of the domain gets 
sampled; in the black-white example above, if our eight samples were distributed 
evenly in the domain, we couldn’t help but hit both regions. Stratified sampling 
doesn’t help on all signals, but by using it we can often get away with a smaller study 
sample than Equation 10.29 would require. 

t Test

Painter and Sloan [328] also use a confidence test, based on the number of samples
n, their variance $v = \sigma^2$, and a desired confidence level $(1 - \alpha)$. They also use the t
test and compare it to a threshold T:

$$t_{\alpha/2}\, v < T \qquad (10.30)$$


Sequential Analysis Test

The theory of sequential analysis is used by Maillot et al. [280] to guide their
sampling. They refer to a test developed by Wald called the sequential probability
ratio test, or SPRT. The test requires a measure of homogeneity, or smoothness, of a
region. They suggest that the range of intensity values $S_i$ with respect to their mean
$\bar{S}$ is a good measure of homogeneity. The test may be phrased in terms of random
samples within the neighborhood represented by this study set. If we pick a random
sample location y in this neighborhood, we would like to find the probability p that
the sample’s value f(y) is within $\epsilon$ of the mean $\bar{S}$. In symbols,

$$p = P(|f(y) - \bar{S}| < \epsilon) \qquad (10.31)$$

If $\epsilon$ is small and p is small (say less than some threshold $p_0$), that suggests that the
region contains a lot of variation, and requires refinement. If p is large (say above a
threshold $p_1$), then the region is homogeneous and only a few samples are required.

To use this test requires an estimate of $\bar{S}$. Since this value is unknown, in practice
Maillot et al. use the mean of the first few samples evaluated in the neighborhood.
The test then proceeds to determine if the signal is homogeneous, and thus smooth,
or inhomogeneous, and thus requiring refinement. If neither answer can be stated
clearly, new samples are drawn until one of the two decisions is accepted or an upper
limit is reached. Maillot et al. note that using the first few samples introduces some
bias, but they did not analyze the result of that bias.

The test runs by first taking a pilot sample set and determining the mean $\bar{S}$ from
it. Then for each sample we decide if it is within the interval of half-width $\epsilon$ around
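
| Name of test | Test | Parameters | Reference |
|--------------|------|------------|-----------|
| SNR | SNR > T | T = maximum SNR | [124] |
| Variance | $\sigma_N^2 / \chi^2_\beta(N-1) < T$ | T | [261] |
| Confidence | $t_{1-\alpha,\,n-1}\, \sigma / \sqrt{n} < w$ | w, $\alpha$ | [351] |
| t test | $t_{\alpha/2}\, v < T$ | T | [328] |
| Sequential analysis | $P(\lvert f(y) - \bar{S} \rvert < \epsilon) < p_T$ | $\epsilon$, $p_T$ | [280] |

TABLE 10.8
Intensity statistics tests.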


that mean. If a large percentage of the points is within that interval, then the region
is assumed to be homogeneous, and sampling can stop. If a large percentage is
outside the interval, the region is heterogeneous and again sampling may stop because
it is assumed that the current set of samples is representative. If the percentage is
between these two extremes then more samples are drawn. Maillot reports that
$p_0$ and $p_1$ should be chosen based on the perceptual qualities of the human visual
system to distinguish homogeneous regions, and that good results have
been obtained with $p_0 = 0.7$ and $p_1 = 0.9$ [279].
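
The loop described above can be sketched as follows. This is a deliberately simplified reading of the prose, using the fraction of samples near the pilot mean rather than Wald's likelihood-ratio bookkeeping; `draw_sample`, `pilot`, and `max_samples` are hypothetical names and defaults introduced for illustration.

```python
def sequential_homogeneity(draw_sample, eps, p0=0.7, p1=0.9,
                           pilot=4, max_samples=64):
    """Classify a neighborhood as homogeneous or inhomogeneous.

    draw_sample() returns one new sample value from the neighborhood.
    Returns a (decision, number_of_samples_used) pair.
    """
    samples = [draw_sample() for _ in range(pilot)]
    # The pilot mean stands in for the unknown true mean (a source of bias).
    mean = sum(samples) / len(samples)
    while True:
        inside = sum(1 for s in samples if abs(s - mean) < eps)
        fraction = inside / len(samples)
        if fraction >= p1:
            return "homogeneous", len(samples)
        if fraction <= p0:
            return "inhomogeneous", len(samples)
        if len(samples) >= max_samples:
            return "undecided", len(samples)
        samples.append(draw_sample())  # ambiguous so far: draw another
```

The `"undecided"` result corresponds to reaching the upper limit on samples without either decision being accepted.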


Summary 

Table 10.8 summarizes the tests in this section. They all perform reasonably well 
in practice for basic anti-aliasing, but they all are based on user-set parameters that 
may be difficult to select. 


10.8 Refinement Sample Geometry 

One set of techniques is based on refining the estimate of a signal based on increasing 
the local density of a fixed sampling pattern. We call this “predictable” geometry, 
since we can state the location of every potential sample before the sampling process 
even begins. These techniques merely evaluate samples at predefined locations. 

An alternative set of methods is based on “unpredictable” geometry. These are 
generally the result of random processes that place samples in arbitrary locations. 

Between these two extremes are those nonuniform patterns that are derived from 
one or more stored templates. Theoretically, we could enumerate all possible sample 
locations derived from these templates. These are usually meant to increase the 
efficiency of unpredictable sampling. In practice, it is easier to classify template
methods with the unpredictable patterns, since the geometry is closer to that of the 
unpredictable patterns than that of the predictable ones. 


10.9 Refinement Geometry

The refinement test typically only indicates when more samples are needed in a 
neighborhood; it does not indicate where those samples ought to go. In this section 
we review various proposals for the placement of new samples. 


10.9.1 Linear Bisection

One class of refinement methods restricts attention to the straight line between two 
samples. Both of the methods we will examine work on a square grid that is used 
for image sampling and is therefore identified with the pixel grid. The methods look 
for borders between homogeneous regions along a line between either pixel centers 
or pixel corners. 

The linear bisection looks for single edges that occur between two objects seen 
by adjacent samples [361]. The approach examines the object represented by each 
sample. The algorithm assumes that if two horizontally or vertically adjacent samples 
represent different objects, then exactly one edge intersects the line between the 
samples, and that edge is shared by those two objects. Figure 10.65 shows examples 
of situations that satisfy and violate this assumption. 

The linear bisection algorithm iterates until the edge is trapped to within some 
fraction of the distance between the samples. The error in the algorithm when its 
assumptions are fulfilled is the amount by which the area measures are off. If we 
assume that all edges are linear, then a worst case is shown in Figure 10.66, where
an edge passes through two pixels almost parallel to the line between their centers. 
The bisection routine will trap the edge near the line between the pixels and assign 
the left one a color of black and the right one white, though both should be almost 
the same shade of gray. 
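
The bisection loop itself is short. In this sketch, `object_at` is a hypothetical callback that reports which object is visible at parameter t along the line from one sample (t = 0) to the other (t = 1); the default tolerance is likewise an arbitrary choice. The code relies on the single-shared-edge assumption described above and will silently trap just one crossing if that assumption is violated.

```python
def trap_edge(object_at, tolerance=1.0 / 16.0):
    """Locate the single edge between two samples by bisection.

    Returns the estimated edge parameter in [0, 1], or None when both
    samples see the same object and no edge is assumed to exist.
    """
    lo, hi = 0.0, 1.0
    obj_lo, obj_hi = object_at(lo), object_at(hi)
    if obj_lo == obj_hi:
        return None
    # Invariant: object_at(lo) == obj_lo and object_at(hi) != obj_lo.
    while hi - lo > tolerance:
        mid = 0.5 * (lo + hi)
        if object_at(mid) == obj_lo:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Because the final interval is no wider than `tolerance`, the returned midpoint is within half that distance of the true crossing.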

This technique was used by Roth, who associated samples with pixel centers
[361]. He does not give details on how to reconstruct the signal from this information,
though its use for line drawing is mentioned. Roth used the object-difference
test for refinement.

Wyvill and Sharp use this method when samples are associated with pixel corners,
which allows them to use a central-star reconstruction technique [491]. This is 
illustrated in Figure 10.67. 

They reconstruct and resample the signal in one step by assuming that the resample
grid is composed of squares with their vertices at the samples. The center of

FIGURE 10.65
(a) The assumption made by the linear bisection algorithm is satisfied here; the objects in the samples
are separated by one shared edge. (b) The assumption is violated by this nonlinear edge.

FIGURE 10.66
A worst case for linear bisection. (a) The edge is almost perpendicular to the vertical edge between
the pixels. (b) The correct answer is a light shade in both pixels, which are each about half-covered.
(c) The result of linear bisection, which assumes the edge is oriented vertically.

FIGURE 10.67
Corner refinement and star reconstruction.


gravity C is found by averaging all points $p_i$ on the sides of the square where edges
have been inferred; because of the assumptions there is at most one such point on
each edge. Then a star is drawn from C to each point $p_i$; these are assumed to be the
boundaries of the colored regions. The signal is reconstructed to have a flat value
across the pixel, with a value given by the estimated mean $\bar{S}$. This mean is estimated
by summing the product of the intensity $I_i$ of each region with its weight $w_i$, equal
to its area:

$$\bar{S} \approx \sum_i w_i I_i \qquad (10.32)$$

The signal is then resampled at the pixel center, yielding the estimated mean for use
as the pixel value.

A similar subdivision method is used by Hashimoto et al. [196]. They begin 
by projecting the silhouette edges of all objects onto the image plane. The plane 
is initially subdivided into a number of large cells, each enclosing many samples 
ultimately intended to be pixel centers. If there are more than two regions in a cell, 
the cell is subdivided at its midpoint into four subcells; here a region is a contiguous 
area due to any single object, or the background. 

When a square cell contains no more than two regions, subdivision stops
and a few samples are evaluated. The samples chosen are those on the corner of 
the cell and two pairs that straddle the expected location of the edge, as shown in 
Figure 10.68. This algorithm needs to use the projected edges both to estimate the 
number of objects in a cell and to determine which samples straddle the edge on the 
cell boundaries. 

The refinement test used by Hashimoto et al. is based on object count within a 

FIGURE 10.68
Samples to be evaluated are marked in black. (a) The initial cell. It contains more than two regions,
so it is subdivided. (b) The upper-left cell contains only one region, so only the corners are selected.
(c) The upper-right cell contains two regions, so the edge is captured between samples. (d) The
lower-left cell also contains two regions and captures an edge. Note that the lower-left corner serves
as one of the two samples that trap the edge. (e) The lower-right cell contains too many regions;
this cell will be subdivided further. Redrawn from Hashimoto et al. in New Advances in Computer
Graphics, fig. 4, p. 554.

region, coupled with linear bisection along the edges of the smallest cells to trap the
known discontinuity between objects. 


10.9.2 Area Bisection

An area bisection test was proposed by Whitted [477] to refine complex regions 
enclosed by four first-level samples; these were identified with pixel corners. The 
refinement test is applied to these four corners of each box. If more samples are 
needed, the square is subdivided into four equal quadrants, five new samples are 
placed and evaluated, and the test is reapplied to each new subsquare, as in
Figure 10.69. This form of binary subdivision can in theory repeat forever in some
special cases (such as a fractal dust cloud), but in practice it is usually halted at some 
large (but arbitrary) number of recursions. 
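
A recursive sketch of this scheme is shown below. It is an illustration under assumptions, not Whitted's code: the refinement test is a simple intensity-difference test on the four corners, a cell's value is estimated as the mean of its corners, and the sample cache (a plain dictionary keyed on position) is a device introduced here so that corners shared between subsquares are evaluated only once.

```python
def adaptive_square(f, x, y, size, delta, max_depth, cache=None):
    """Estimate the mean of f over the square with lower-left corner (x, y).

    f(px, py) evaluates one sample; delta is the intensity-difference
    threshold; max_depth bounds the recursion.
    """
    if cache is None:
        cache = {}

    def sample(px, py):
        if (px, py) not in cache:
            cache[(px, py)] = f(px, py)
        return cache[(px, py)]

    corners = [sample(x, y), sample(x + size, y),
               sample(x, y + size), sample(x + size, y + size)]
    if max_depth == 0 or max(corners) - min(corners) <= delta:
        return sum(corners) / 4.0
    half = size / 2.0
    # Four equal quadrants; their corners add five new sample positions.
    return sum(adaptive_square(f, cx, cy, half, delta, max_depth - 1, cache)
               for cx in (x, x + half) for cy in (y, y + half)) / 4.0
```

Passing an explicit `cache` lets a caller count evaluations: a uniform square costs four samples, and one level of subdivision brings the total to nine.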

Two different forms of isosceles triangular subdivision have been studied by Shu 
and Liu [405]. They have looked at subdivision based on right isosceles triangles 
and symmetrical isosceles triangles, as shown in Figure 10.70. 

Each stage of subdivision introduces three new samples, located on the edges of 
the old cell. Each new sample is shared by two of the larger cells. They present 
an analysis that suggests that a triangular cell subdivision will usually require fewer
samples to produce an image of equal quality to a square cell. This is to be expected 
given the more directionally isotropic properties of the hexagonal lattice, as discussed 
earlier. 

The method described by Jansen and van Wijk [228] starts with large cells, each 
of which encloses many final elements (these are identified with pixels). A sample is 
then placed in the center of the lower-left element of each cell and evaluated. The 
entire cell is assigned a single constant intensity from that sample; all points within 

FIGURE 10.70
(a) Right triangular subdivision. (b) Centered isosceles triangular subdivision.


the cell return that value until the cell is subdivided and a new lower-left corner is 
established. When all cells have been evaluated with an initial sample, a scanning 
process starts at the bottom-left cell and proceeds left-right, bottom-up. Each cell 
encountered is subdivided into four equal square subcells. In each subcell on the 
top row and right column, a new sample is placed in the center of the lower-left 
pixel. This sample is not evaluated, but is rather assigned the value of the parent 
cell. Figure 10.71 shows the geometry. 

Each of these four subcells is then examined. The position of each subcell specifies 
which samples are used in the refinement test. On the basis of the result of that test, 
the value of the sample in the lower-left corner of that cell is either left unchanged 
(so it inherits the value of its parent), or it is explicitly evaluated. For the three cells 
in the upper row and right column, the test compares the current value of the cell 
with the values of three neighboring cells. 

The samples used in the refinement test are determined for the four subcells as 
follows, using the naming of Figure 10.72. The lower-left cell is unchanged. For 
the upper-right cell, the cells marked A are selected. For the upper-left cell, the cells 
marked B are selected. For the lower-right cell, the cells marked C are selected. The 
samples in the selected cells are passed to the refinement test and compared against 
the value of the parent cell. If the test fails for any of the three subcells, the sample in 
the lower-left corner of that subcell is evaluated. Otherwise that sample is unchanged 







FIGURE 10.71
Initial sampling and subdivision from Jansen and van Wijk [228].


and thus inherits the value of the parent cell. Recall that the value in any cell is the 
color of its lower-left sample, so if the upper-right sample in a subcell is examined 
but that cell has not been explicitly sampled, the value of the lower-left corner of 
the parent cell is used. When a complete scan has finished, another scan may begin. 
This technique is also used by Bronsvoort et al. [64]. 

The adaptive refinement presented by Bouville et al. is based on increasing the 
sampling density of a diamond pattern [59]. Figure 10.73 shows the approach. A 
square grid is laid down and a diamond sampling pattern (marked P) is used in the 
first level. Where refinement is called for, the edges and centers of the diamonds 
(marked S) are evaluated. The process may be repeated recursively. This is very 
similar to the square adaptive procedure due to Whitted described above, except 
that it is oriented at 45° to the underlying square resampling grid. In this example, 
the initial lattice samples form a checkerboard pattern on the underlying pixel grid; 
after refinement, diamond samples land on pixel centers or corners. 

A similar approach is described by Akimoto et al. [6]. They also assume an 
underlying square resampling grid. The grid is subdivided into supercells containing 
many elements from the underlying grid, and the corners of these cells are evaluated. 
They use an intensity difference test to determine if refinement is required. They 



FIGURE 10.72

Sample selection geometry and recursion from Jansen and van Wijk [228]. 














note that when the supercells are aligned to the resampling grid, then features at an 
angle to the grid are likely to be missed, as in Figure 10.74. 

To suppress this problem, they alternate between square and diamond patterns
(with respect to the underlying resampling grid). They begin with the square (oriented)
supercells and determine which need refinement. Before proceeding with the
refinement, though, the value at the center of each square is found (either by explicitly
creating and evaluating a sample, or by estimating from the corner values). This
now converts the original square grid into a higher-resolution diamond pattern, as
in Figure 10.75.

The refinement of the diamond pattern leads to another square pattern, and the 

FIGURE 10.74

The “bow tie” between these two checkerboard squares is missed when the supercell is oriented 
with the underlying sampling grid; the result is breakup of the pattern in the image. 


two patterns alternate as needed until the subdivision limit is reached or no cells are 
flagged for subdivision. 


10.9.3 Nonuniform Geometry

When sampling is nonuniform, the bisection techniques discussed above are not 
appropriate for adaptive refinement. Rather than precisely searching some small 
region for a desired feature, nonuniform refinement generally involves producing 
more samples in the region, such that some relevant statistics of the sampling pattern 
are preserved. In particular, it is often desirable to maintain some sort of Poisson-disk 
pattern or approximation. 


10.9.4 Multiple-Level Sampling

A multiple-level sampling algorithm precomputes a number of sampling patterns 
at different densities. Typically the lowest-density pattern is used to generate and 
evaluate samples. Then refinement tests are applied to the samples; in regions where 
a higher sampling density is required, the next-denser pattern is applied in that 
region. This process may recur until the highest-density template has been used. If 
an algorithm has n templates, we call it an n-level strategy . 

Suppose the templates are ordered in a list $\{T_1, T_2, \ldots, T_n\}$, such that $T_1$ is the
lowest-density version and $T_n$ is the highest. Each template may be generated
independently to possess the desired statistics for that density. But the construction
process may take into account the intended use, so that each partial sum of k templates
$\{T_1, T_2, \ldots, T_k\}$, $k < n$, will also possess the correct statistics. We say that
a set of templates satisfying this condition is cumulatively compatible. If a set of
templates is cumulatively compatible, then all the samples in templates 1 to k taken
in the region together form a statistically desired set. Otherwise, at step k we must
decide how to handle the samples from templates 1 to k - 1. Some choices include
discarding them (this is very undesirable; remember the Sampler’s Credo), reconstructing
each set separately and somehow combining the results, or reconstructing
all the samples together despite their statistics.

FIGURE 10.75
The refinement process of Akimoto et al. [6]. (a) The original square grid is built and tested for
refinement. (b) The cell centers are estimated and the diamond grid is tested. (c) The process
repeats.

Cook proposed a two-level strategy for distribution ray tracing [101]. In his 
application, the function being sampled was an image, so the regions were pixels. 
The first pass created sixteen samples per pixel; a second template contained sixty- 
four samples per pixel. This wasn’t strictly a two-pass method since either one 
sampling density or the other was selected before sampling began, based on motion 
estimates within the pixel. 

A similar two-level strategy was used by Mitchell [307]. His initial base pattern 
has a density of about one sample per pixel. The refinement test looks at a cluster 
of eight or nine of these samples. If the test indicates refinement is needed, then a 
second-level pattern is used in that area, with a density of about four or nine samples 
per pixel. Mitchell notes that for a distribution ray tracer, higher densities at both 
levels may be necessary. 

Dippe and Wold [124] use several independent sets of samples in their error 
estimator. Presumably these are uncorrelated sampling patterns of roughly the same 
density. They mention that when refinement is called for, more samples need to be 
generated in the region, but they do not discuss the geometry of these new samples, 
nor the disposition of the sets. 


10.9.5 Tree-Based Sampling

Another class of algorithms uses a tree-based data structure to organize the samples 
taken so far and guide the placement of new samples. 

Kajiya presented a number of ways to use trees for adaptive sampling [234].
The first is called sequential uniform sampling. The idea is that at any time in the 
refinement process, the sampling pattern consists of a number of regions, with a 
single sample in each region. Any region may be refined by splitting it in two; the 
existing sample will land in one of the two subregions, and a new sample may then 
be placed in the other, empty subregion. 

We can describe this hierarchy of regions with a tree. Each internal node represents
a split region; each leaf node is a region with a single sample, as in Figure 10.76.
The procedure for splitting a node is summarized in Figure 10.77. 




FIGURE 10.76

A refinement tree and its associated domain. 


There are several ways to go about choosing which node should be split at any 
time. To get the most uniform distribution of samples, the tree can be scanned 
breadth-first, so that all nodes at any given level are split before any nodes below 
that level are split. Kajiya notes that if the nodes are examined in strict tree-traversal 
order (say prefix or infix order), then the result is very orderly. A better way to 
go for breadth-first splitting is to choose the nodes in random order, as shown in 
Figure 10.78. 

This approach will search out the domain spanned by the root region, but not in 
any predetermined order. 
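
The refinement and node-selection procedures can be sketched for a one-dimensional domain as follows. Several details are assumptions made for this illustration: regions are intervals split at their midpoint, the new sample is placed uniformly at random in the empty half, "level" is interpreted as the height of a subtree, and "balanced" means that both subtrees have equal height all the way down. Under this interpretation the order is only approximately breadth-first.

```python
import random


class Node:
    """A region [lo, hi]; a leaf holds exactly one sample."""
    def __init__(self, lo, hi, position, value):
        self.lo, self.hi = lo, hi
        self.position, self.value = position, value
        self.left = self.right = None

    def is_leaf(self):
        return self.left is None


def height(node):
    return 0 if node.is_leaf() else 1 + max(height(node.left), height(node.right))


def is_balanced(node):
    if node.is_leaf():
        return True
    return (height(node.left) == height(node.right)
            and is_balanced(node.left) and is_balanced(node.right))


def choose_subnode(node, rng):
    left, right = node.left, node.right
    if left.is_leaf():
        return left
    if right.is_leaf():
        return right
    if height(left) < height(right) and is_balanced(left):
        return left
    if height(right) < height(left) and is_balanced(right):
        return right
    return left if rng.random() < 0.5 else right


def refine_node(node, signal, rng):
    """Descend to a leaf, split it, and evaluate one new sample."""
    while not node.is_leaf():
        node = choose_subnode(node, rng)
    mid = 0.5 * (node.lo + node.hi)
    if node.position < mid:   # old sample stays in the left half
        new_pos = rng.uniform(mid, node.hi)
        node.left = Node(node.lo, mid, node.position, node.value)
        node.right = Node(mid, node.hi, new_pos, signal(new_pos))
    else:                     # old sample stays in the right half
        new_pos = rng.uniform(node.lo, mid)
        node.left = Node(node.lo, mid, new_pos, signal(new_pos))
        node.right = Node(mid, node.hi, node.position, node.value)


def leaves(node):
    return [node] if node.is_leaf() else leaves(node.left) + leaves(node.right)
```

Each call to `refine_node` adds exactly one sample, so after k calls on a single-sample root the tree has k + 1 leaves whose regions tile the original domain.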

The splitting operation of Figure 10.78 may be generalized to n-dimensional 

Refine Node (node)
    if node is not a leaf                        If inside tree, descend to a leaf.
        then
            subnode ← Choose a Subnode (node)    Assign a child to subnode.
            Refine Node (subnode)
            return
    endif
    Split node                                   Split the leaf and take a new sample.
    Put old sample in its region
    Evaluate new sample in other region

FIGURE 10.77
Node refinement.

Choose a Subnode (node)
    if left-child is a leaf
        then return (left-child)                 If left child is a leaf, take it.
    endif
    if right-child is a leaf
        then return (right-child)                If right child is a leaf, take it.
    endif
    if level (left-child) < level (right-child) and is-balanced (left-child)
        then return (left-child)                 Select left if shallower and balanced.
    endif
    if level (right-child) < level (left-child) and is-balanced (right-child)
        then return (right-child)                Select right if shallower and balanced.
    endif
    if uniform() < 0.5
        then return (left-child)                 Return either child at random.
        else return (right-child)
    endif

FIGURE 10.78
Random order breadth-first refinement.

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distributions by replacing the binary tree with a k-d tree [36]. In a k-d tree, each 
level corresponds to subdivision along a given dimension. For example, the first level 
might split along the x axis, then the second level along y, the third along 2 , and 
so on. This approach generalizes to any interpretations of the dimensions, not just 
those representing space. 
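The two refinement procedures can be sketched in a few lines of Python for a 1D
domain. This is only an illustration under stated assumptions: the class `Node`, the
midpoint split, and the definitions of `level` (height of the subtree) and
`is_balanced` (all leaves at the same depth) are our own choices, since the figures
do not define them.

```python
import random

class Node:
    """A 1D sampling region [lo, hi] holding one sample position (hypothetical layout)."""
    def __init__(self, lo, hi, sample):
        self.lo, self.hi, self.sample = lo, hi, sample
        self.left = self.right = None

    def is_leaf(self):
        return self.left is None

def level(node):
    # Assumption: "level" is the height of the subtree rooted at node.
    if node.is_leaf():
        return 0
    return 1 + max(level(node.left), level(node.right))

def is_balanced(node):
    # Assumption: balanced means every leaf below node sits at the same depth.
    if node.is_leaf():
        return True
    return (level(node.left) == level(node.right)
            and is_balanced(node.left) and is_balanced(node.right))

def choose_subnode(node, rng):
    left, right = node.left, node.right
    if left.is_leaf():
        return left
    if right.is_leaf():
        return right
    if level(left) < level(right) and is_balanced(left):
        return left
    if level(right) < level(left) and is_balanced(right):
        return right
    return left if rng.random() < 0.5 else right

def refine_node(node, rng):
    # Descend to a leaf, then split it at its midpoint.
    while not node.is_leaf():
        node = choose_subnode(node, rng)
    mid = 0.5 * (node.lo + node.hi)
    old = node.sample
    if old < mid:
        # The old sample stays in the left half; a new sample goes in the right half.
        node.left = Node(node.lo, mid, old)
        node.right = Node(mid, node.hi, rng.uniform(mid, node.hi))
    else:
        node.left = Node(node.lo, mid, rng.uniform(node.lo, mid))
        node.right = Node(mid, node.hi, old)

def leaves(node):
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)
```

Each call to `refine_node` adds exactly one new sample, so calling it k times on a
single-sample root leaves k + 1 regions. For a k-d tree, the split axis would simply
cycle with depth.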

Since we have knowledge of the regions represented by each sample, we can use 
that information to speed up our process of estimating the local signal using a process 
called hierarchical integration . We can build our approximation of the signal as a 
piecewise-continuous function, where the pieces are given by the sampling regions 
and the values are given by the corresponding sample values in those regions. By 
weighting each sample with its associated area, Kajiya points out, we will often get 
a better estimate of the signal more quickly than if we work simply with the many 
point samples and ignore their associated areas. Use of this technique will probably 
influence the selection and use of the test used for refinement. 
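A small sketch may make the benefit of area weighting concrete. We assume a 1D
domain tiled by regions stored as hypothetical `(lo, hi, value)` tuples; neither the
layout nor the function names come from Kajiya's presentation.

```python
def area_weighted_estimate(regions):
    """Mean of a piecewise-constant signal; regions are (lo, hi, value) tuples."""
    total = sum(hi - lo for lo, hi, _ in regions)
    return sum((hi - lo) * value for lo, hi, value in regions) / total

def unweighted_estimate(regions):
    """Plain average of the point samples, ignoring their associated areas."""
    return sum(value for _, _, value in regions) / len(regions)

# Three regions of unequal size tiling [0, 1].
regions = [(0.0, 0.5, 1.0), (0.5, 0.75, 2.0), (0.75, 1.0, 4.0)]
```

For these regions the area-weighted estimate is 2.0, while the plain average of the
three samples is about 2.33, because the two small regions are overrepresented when
their areas are ignored.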

This method may be made more powerful by adding an adaptive element, resulting
in adaptive hierarchical integration. Kajiya points out that the refinement
test may be driven by factors other than just uniformity; we will see some examples
below. The test may make use of information stored in the sampling tree either for
other reasons or specifically to improve sampling quality. The different criteria end
up controlling our descent in the tree, influencing our choice of the region we finally
split. Suppose we have n different tests, each of which has an output $\phi_i$; this is the
probability that we should choose the left subnode (so $\phi = 1$ is a sure choice for the
left node, $\phi = 0$ is a sure choice for the right, and $\phi = 0.5$ means that the choice will
be random). We can weight these different probabilities with a scalar $w_i > 0$ for
each test; to make things easier, we assume the weights add to 1:

$$\sum_{i=1}^{n} w_i = 1 \tag{10.33}$$

We can form a final choice from the sum of the choices:

$$\phi = \sum_{i=1}^{n} w_i \phi_i \tag{10.34}$$
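As an illustration, the weighted combination and the resulting random choice might
be coded as follows. The function names are ours, and the division by the total
weight is an added convenience so that the caller need not normalize the weights.

```python
import random

def combined_probability(phis, weights):
    """Weighted sum of the per-test probabilities of choosing the left subnode."""
    if len(phis) != len(weights):
        raise ValueError("one weight per test is required")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    total = sum(weights)  # equals 1 if the weights are already normalized
    return sum(w * p for w, p in zip(weights, phis)) / total

def choose_left(phis, weights, rng=random):
    """Return True to descend left, False to descend right."""
    return rng.random() < combined_probability(phis, weights)
```

With outputs (1, 0.5, 0) and weights (0.5, 0.25, 0.25), the combined probability of
descending left is 0.625.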

Most of the tests discussed above were designed to produce binary results: the
output is simply whether a region needs refinement or not. With the test just
described, we can allow more range in our result, expressing just how much an area
needs refinement. Kajiya tried out five different strategies:

■ The constant threshold ($\phi = c$).

■ The random threshold ($\phi = 0.5$).

■ The difference of integrals associated with the subnodes.

■ Statistics based on the samples below the subnode.

■ User knowledge of the function ($\phi$ comes from a lookup table).

Finally, we can choose the splitting plane to be somewhere other than the midpoint 
of the range. Specifically, we can choose to subdivide where we think it likely that 
we will have equal amounts of signal on both sides. This is a form of importance 
sampling with dynamic stratification ; we make up the regions as we go. Finding the 
right place to split is usually not obvious. Kajiya points out that if the filter we will 
eventually use is known in advance, then we can try splitting so that we have equal 
integrals under the filter in each of the two subregions. This approach isn’t perfect, 
since it’s really the product of the filter and the signal that we want to balance, but 
it’s a starting point. If the filter is stored in a sum-table as described by Crow [111], 
then finding the midpoint of the filter in any region is simplified. Shirley and Wang 
[401] point out that using the filter as the sampling has some theoretical justification 
for the types of signals we usually encounter in computer graphics. 

In one dimension, suppose that the left and right ends of the interval have values
$F(x_l)$ and $F(x_r)$ in the sum table; we need only search this interval (by bisection or
other means) to find that point $x_m$ where $F(x_m) = [F(x_l) + F(x_r)]/2$.
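The bisection search can be sketched as below. For brevity we assume the sum table
is available as a nondecreasing function `F` rather than a discrete array; the
closed-form `tent_cdf` is a stand-in we supply for the running integral of a tent
filter on [-1, 1].

```python
def split_point(F, x_l, x_r, tol=1e-9):
    """Find x_m in [x_l, x_r] with F(x_m) = (F(x_l) + F(x_r)) / 2 by bisection.

    F is assumed to be nondecreasing (a running integral of a nonnegative filter).
    """
    target = 0.5 * (F(x_l) + F(x_r))
    lo, hi = x_l, x_r
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if F(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def tent_cdf(x):
    """Running integral of the tent filter 1 - |x| on [-1, 1] (illustrative only)."""
    if x < 0.0:
        return 0.5 * (x + 1.0) ** 2
    return 1.0 - 0.5 * (1.0 - x) ** 2
```

Splitting the right half of the tent, the interval [0, 1], places the plane near
x = 0.293 rather than at the midpoint, since most of the filter's weight lies close
to the origin.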

Painter and Sloan [328] also organize the samples into a tree. The domain is 
originally one large region with one sample. When the neighborhood is refined, the 
region is subdivided and the existing sample is associated with the subregion it falls 
in; the other regions are then sampled in turn. The tree structure is used to guide 
the adaptive sampling process by storing some additional information at each node. 
The subdivision tests and sampling structures are closely coupled in this technique. 

Central to their approach is the idea of a target reconstruction density. When
sampling an image, this is the density of the pixel resampling grid, so they refer to
this density as the pixel level. They note that the goal of the sampling process is
to produce the most accurate answer at the target level, so they use two different
strategies when sampling: one for nodes above the target level (regions larger than
a pixel), and another for nodes below it. Above the target level, they try to sample so
that no large regions of the domain are unsampled, and to locate large-scale features
that will need closer attention. At resolutions above the target density (i.e., nodes
below the target level), they want only to increase their confidence in the estimate of
the mean for the target-level parent of that node.

To guide the sampling process, they save several pieces of information at each 
node: 

■ The area of the region represented by this node. 

■ The mean of all samples at and below this node. 

■ The number of samples at and below this node. 

■ The internal variance: the variance of the samples at and below this node.

■ The external variance: the variance of the mean estimates at this node and all
its sibling nodes (i.e., all first-generation children of this node’s parent).

The internal variance estimates the local complexity of the signal represented by this 
node, while the external variance gives us a measure of the complexity in this part 
of the subtree. The information at each node is used to give it a priority ; when a 
new sample is to be taken, it will go into the highest-priority node (ties are broken 
randomly). 

Each node at the target level (and each leaf above that level) is given a priority 
formed by the product of external variance and area. Each internal node above that 
level inherits a priority formed by the maximum of its children’s priorities. Leaf 
nodes below the target level are assigned priorities so that of all children of a given 
node, the child with the highest mean variance has the highest priority. 

A leaf node is removed from further refinement, or “closed off,” if the confidence 
level there meets the desired threshold. An internal node is closed off if both of its 
children are closed off. When a node is closed off, its sampling is complete, and no 
further samples will be taken in that node’s subtree. 
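The priority and closing rules for nodes at or above the target level can be sketched
as follows. The record layout is hypothetical, the tree is modeled only down to the
target level (so target-level nodes appear as leaves here), and the boolean
`confident` stands in for whatever confidence test is actually used.

```python
class SampleNode:
    """Hypothetical node record for a binary sampling tree."""
    def __init__(self, area, external_variance=0.0, confident=False, children=()):
        self.area = area
        self.external_variance = external_variance
        self.confident = confident      # assumed outcome of a confidence test
        self.children = list(children)

def priority(node):
    """Leaves: external variance times area. Internal nodes: max over children."""
    if not node.children:
        return node.external_variance * node.area
    return max(priority(child) for child in node.children)

def is_closed(node):
    """A leaf closes when it is confident; an internal node when all children are."""
    if not node.children:
        return node.confident
    return all(is_closed(child) for child in node.children)
```

A new sample would be directed down the tree toward the open child with the largest
priority, with ties broken randomly.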


10.9.6 Multiple-Scale Template Refinement

The multiple-scale patterns mentioned earlier may be used directly for finding new 
sample locations when refining a region. They may be considered a variant of the 
multiple-level approach. We can think of the creation of a multiple-scale pattern as 
the construction of a set of cumulatively compatible templates. Thought of this way, 
the first sample in the pattern defines template $T_1$, the second sample is template $T_2$,
and so on. This is not a particularly useful point of view, since each template adds 
only one new sample. 

The most straightforward approach is to place the template over the region to be 
sampled, and then evaluate samples from the template one after the other until the 
refinement test is satisfied. The selections may be taken directly from the template 
[308], or as modulated by a filter [294]. 


10.10 Interpolation and Reconstruction 

In Chapter 5 we identified the errors that arise when a bandlimited signal is not 
sampled quickly enough, and copies of the spectrum fold onto the baseband (or 
central copy). This is usually called aliasing. When the signal is correctly sampled
but incorrectly reconstructed, the errors may look like aliasing errors, but in this
book we call them reconstruction errors. (Other common names for reconstruction
errors are post-aliasing errors [310] and, for 2D images, rasterization [109].)

Typically in computer graphics we reconstruct in order to resample at a lower rate,
which usually requires a low-pass filter after the reconstruction filter, as discussed
in Chapter 5. The central issues in practical reconstruction are the choices of these
filters. The reconstruction filter transforms the discrete-time signal into a
continuous-time signal, and the low-pass filter bandlimits that signal so that no frequencies
remain that are above the Nyquist rate of the resampling pulses.

In computer graphics practice it is common to see the reconstruction and low-pass 
filters combined into one. When the samples are nonuniform, this can be difficult to 
implement, so the two steps are executed sequentially. When filtering isn’t necessary, 
we often reconstruct and resample in one step, by convolving the signal with the 
reconstruction filter only at the points where we wish to resample. This is usually 
less work than reconstructing the entire signal in the frequency domain, since that 
would require a forward and inverse Fourier transform step. 
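A minimal sketch of this one-step reconstruct-and-resample idea for uniformly spaced
1D samples is given below. The tent kernel and all names are our own choices for
illustration; any finite-width reconstruction filter could be substituted.

```python
import math

def tent(x, radius=1.0):
    """Tent (linear interpolation) kernel, zero beyond the given radius."""
    return max(0.0, 1.0 - abs(x) / radius)

def resample(samples, spacing, points, kernel=tent, radius=1.0):
    """Evaluate sum_n samples[n] * kernel(p - n * spacing) only at the given points."""
    out = []
    for p in points:
        # Only samples within the kernel radius of p can contribute.
        first = max(0, math.ceil((p - radius) / spacing))
        last = min(len(samples) - 1, math.floor((p + radius) / spacing))
        out.append(sum(samples[n] * kernel(p - n * spacing, radius)
                       for n in range(first, last + 1)))
    return out
```

The continuous signal is never built; each output value touches only the few samples
under the filter's footprint.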

The theory of reconstruction for uniformly sampled signals was discussed in detail 
in Chapter 8 and that for nonuniform samples in Chapter 9. Adaptive subdivision 
on a regular grid is an interesting case that is somewhere between the two. From a 
sampling point of view, adaptive subdivision on a grid (such as the square subdivision 
method of Whitted) is properly viewed as a variant of uniform sampling, since there 
is a regular pattern to the sample geometry, though not all the sample locations 
have been evaluated. When reconstructing, it is better to think of the result of this 
operation as a nonuniform set of samples, since we cannot use the regular filters that 
expect a value at each filter location. 

Graphics algorithms typically have not employed the nonuniform reconstruction
methods of Chapter 9, but instead have relied on simpler, local approximations
to the signal. In this section we will review some of the published algorithms for
reconstructing from a nonuniform set of samples.

We will usually take the point of view that we have a set of samples that has not 
been processed in any way (except that the samples have been evaluated), and we 
want a new value of the signal at some particular reconstruction point (this is the 
resample point, but this term emphasizes that we are reconstructing and filtering the 
signal before sampling again on a sample-by-sample basis). This location is usually 
associated with a neighborhood, and only samples within that neighborhood are 
involved in the reconstruction. When rendering images, the resample point is often
the pixel center, and the neighborhood is the surrounding pixel or a 3 × 3 grid of
pixels. For other types of signals, the neighborhood is usually implied by the radius
of the reconstruction filter. Although this filter sometimes has an infinite width in
theory, in practice it is always zero beyond some distance, and we ignore all samples
beyond that distance from the reconstruction point. We will call the location of the
reconstruction point P, its value $P_v$, and the neighborhood N; each of the n samples
in the neighborhood has a location $s_p$ and a value $s_v$.

We will assume our original signal is f(t) and it has a Fourier transform $F(\omega)$.
Unless otherwise stated, we will also assume that f(t) is bandlimited, that is, $F(\omega) = 0$
for all $|\omega| > \omega_0 = \pi/T$. The function is sampled at sample locations $t_n$. If the
samples are uniform, then $t_n = nT$; otherwise the index n only serves to identify the
sequencing of the samples.


10.10.1 Functional Techniques

Functional techniques are based on transforming the input data in some way,
typically into another domain. They are not directly useful in computer graphics because
they are impractical for large numbers of samples, but we will review two approaches
that give the flavor of the technique.

A functional reconstruction of nonuniform samples has been presented by Kim 
and Bose [244]. They build a frequency-space representation of the signal which 
contains uniformly spaced frequency samples. This representation takes the form of 
a matrix, which is then processed by the inverse discrete Fourier transform to recover 
the original signal. They have found that this is always possible in 1D, but that in
2D the transformation matrix does not always exist. They present the necessary 
conditions for the existence of the 2D matrix, and discuss a block algorithm to make 
the matrix computations more efficient. 

Wingham has taken an approach whereby the signal is expanded in a series form 
[483]. The linear algebra method of singular value decomposition is used to identify 
the eigenvalues and eigenvectors of the signal. These measures provide a way to 
reconstruct the large-scale structure of the signal, and offer tolerance to noise in the 
signal. 

Although functional methods are capable of extracting a lot of useful information
from the signal, they typically require processing of the entire signal at every
stage. For a typical computer graphics image with a million or more samples, this 
is very expensive; even for an illumination hemisphere of several hundred samples 
and carefully designed implementations, the costs are probably prohibitive. The 
examples used by Kim and Bose reconstruct the signal from only nine sample points, 
and Wingham presents his results on a signal with eight samples. 

10.10.2 Warping

The family of warping techniques is based on a simple idea: if we can map the 
nonuniform samples onto a uniform grid with an invertible mapping, we can
reconstruct on the uniform grid using traditional methods, and then invert the mapping
to get the result in the original space. We will summarize the work of Clark et al. 
[92] as a representative sample of this approach. 

We begin by recalling the 1D uniform sampling theorem from Equation 8.13. If
f(t) is a function with Fourier transform $F(\omega)$ such that $F(\omega) = 0$ for $|\omega| > \omega_0 = \pi/T$,
and is sampled at the points $t_n = nT$, then f(t) can be exactly reconstructed from
its samples from

$$f(t) = \sum_n f(nT)\,\mathrm{sinc}\left[\frac{\omega_0}{\pi}(t - nT)\right] \tag{10.35}$$

FIGURE 10.79

(a) A function f(t) sampled nonuniformly at locations $\{t_n\}$. (b) The warped function $h(\tau)$.

We now consider the nonuniform sample sequence $\{t_n\}$ applied to f(t), as in
Figure 10.79(a). We would like to warp the t axis so that all the samples are
uniformly spaced with interval T. Suppose we had a warping function $\gamma(t)$ that gave
us a new axis, $\tau$. Then the image of f(t) on that axis, which we call $h(\tau)$, would be
uniformly sampled, as in Figure 10.79(b).

If we can carry out this mapping (and $h(\tau)$ is bandlimited), then we can use the
techniques of uniform reconstruction on $h(\tau)$. If the mapping function $\gamma$ is invertible,
then we can recover our original function $f(t) = h(\gamma(t))$.

We can state this observation as the 1D nonuniform sampling theorem:

The 1D Nonuniform Sampling Theorem: A signal f(t) is
sampled nonuniformly at points $t = t_n$. If a one-to-one
continuous mapping $\gamma(t)$ exists such that $nT = \gamma(t_n)$, and if
$h(\tau) = f(\gamma^{-1}(\tau))$ is bandlimited to $\omega_0 = \pi/T$, then f(t) may
be reconstructed from

$$f(t) = \sum_n f(t_n)\,\mathrm{sinc}\left[\frac{\omega_0}{\pi}(\gamma(t) - nT)\right] \tag{10.36}$$

This theorem requires that $h(\tau)$ be bandlimited, but it says nothing about the original
signal f(t). In fact, it is rather surprising that f(t) need not be bandlimited for the
reconstruction to proceed successfully.
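The theorem can be checked numerically on a small example. In the sketch below the
warp $\gamma(t) = \sinh t$, the bandlimited function $h(\tau) = \mathrm{sinc}^2(\tau/4)$, and the truncation
of the sum to 801 terms are all our own illustrative choices. We use the normalized
convention sinc(x) = sin(πx)/(πx), which is what `numpy.sinc` computes, so that with
$\omega_0 = \pi/T$ the kernel argument is $(\gamma(t) - nT)/T$.

```python
import numpy as np

T = 1.0                                   # uniform spacing on the warped axis

def gamma(t):
    """Hypothetical warp: continuous and one-to-one."""
    return np.sinh(t)

def gamma_inv(tau):
    return np.arcsinh(tau)

def h(tau):
    """Bandlimited to pi/2, which is below omega_0 = pi / T."""
    return np.sinc(tau / 4.0) ** 2

def f(t):
    """The signal on the original axis; it is what we actually sample."""
    return h(gamma(t))

n = np.arange(-400, 401)
t_n = gamma_inv(n * T)                    # nonuniform locations with gamma(t_n) = nT
f_n = f(t_n)                              # the only data the reconstruction uses

def reconstruct(t):
    """Truncated sum of f(t_n) sinc[(gamma(t) - nT) / T]."""
    t = np.atleast_1d(np.asarray(t, dtype=float))
    kernel = np.sinc((gamma(t)[:, None] - n[None, :] * T) / T)
    return kernel @ f_n
```

Evaluating `reconstruct` between the sample locations and comparing against `f`
shows agreement up to the small error caused by truncating the infinite sum.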

There is an alternative way to write Equation 10.36 that can give some different 
insights into the problem. Consider a signal that is very smooth in some places 
and changes quickly in others. It seems reasonable that if we only can take a 
predetermined number of samples, they ought not to be distributed uniformly, but 
rather should be clustered densely near the busy parts of the signal, and scattered 
more sparsely where the signal is smooth. In other words, the density of samples in 
some interval of the signal should match the amount of high-frequency information 
in that interval (this is importance sampling reappearing). 

To estimate the local high-frequency content, we assume that at every point on 
f(t) we have an associated bandwidth estimate B(t). Then we know that if we have 
at least 2 B(t) samples per unit time in that interval, we can exactly reconstruct the 
signal. Said in reverse, we can find the desired spacing between samples from the 
implicit relation 


Equation 10.37 gives us an implicit relation on the spacing of the samples so that 
they capture all the high-frequency information and the signal may be reconstructed. 
Our reconstruction formula is then

$$f(t) = \sum_n f(t_n)\,\mathrm{sinc}\left[2B(t)(t - t_n)\right] \tag{10.38}$$

The instantaneous sampling rate at some point in the signal is given by $d\gamma(t)/dt$,
the derivative of the warping function, so

$$\frac{d\gamma(t)}{dt} = \frac{2\pi}{\omega_0} B(t) \tag{10.39}$$

Integrating both sides, we find

$$\gamma(t) = \frac{2\pi}{\omega_0} \int B(t)\,dt \tag{10.40}$$

If the local bandwidth B(t) is roughly constant in some interval around t, then we
can simplify Equation 10.40 to

$$\gamma(t) = \frac{2\pi}{\omega_0} B(t)\,t \tag{10.41}$$

When this approximation holds, we see that Equations 10.36 and 10.38 are
equivalent.
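To make the relationship between bandwidth and warp concrete, the following sketch
integrates a tabulated bandwidth estimate numerically and then inverts the warp to
find sample locations. The trapezoid rule, the choice $\gamma = 0$ at the first table entry,
and the function names are assumptions made for illustration.

```python
import numpy as np

def warp_from_bandwidth(t, B, omega_0):
    """Cumulative trapezoid integral of (2 pi / omega_0) B(t), starting from 0."""
    rate = (2.0 * np.pi / omega_0) * B
    steps = 0.5 * (rate[1:] + rate[:-1]) * np.diff(t)
    return np.concatenate(([0.0], np.cumsum(steps)))

def sample_locations(t, gamma, T):
    """Invert the increasing warp: find each t_n with gamma(t_n) = n T."""
    n_max = int(np.floor(gamma[-1] / T))
    targets = np.arange(n_max + 1) * T
    return np.interp(targets, gamma, t)   # requires gamma increasing, i.e., B > 0
```

With T = 1, $\omega_0 = \pi$, and a constant bandwidth B = 2, the warp reduces to
$\gamma(t) = 4t$ and the sample locations come out 0.25 apart, that is, 2B samples per
unit time.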

The reconstruction formula of Equation 10.38 represents convolution of the
sample points $f(t_n)$ with the inverse Fourier transform of a time-varying low-pass
filter with cutoff frequency $\omega_p = 4\pi B(t)$. Determining B(t) is difficult because it is
intended to be a local measure, and Fourier transforms require the entire signal to be 
used. One way to estimate the local bandwidth is to use a wavelet transformation. 
Alternatively we can just take a short-term Fourier transform and hope that it is 
a reasonable local estimate. But since we never use infinite-width signals or filters 
in practice, if we window the signal and take a transform of that local region, we 
are only increasing the severity of an error we regularly commit and are content to 
tolerate. 

The essential step of this method is to find the warping function $\gamma(t)$ that satisfies
our invertibility and bandlimiting conditions for a given set of samples $\{t_n\}$. In one
dimension this seems like a plausible task; it is much harder in general in 2D. This
is due to one of the classic problems in generalizing problems of this type: in 1D
we know how to sort without ambiguity, but in 2D we do not. We may try several 
heuristics to “juggle” a set of samples into some uniform lattice, but no algorithms 
are known that will preserve connectivity and adjacency for all nonuniform sample 
geometries. 

The situation is different if the sampling pattern is always the same, or is one of a 
small number of known patterns. It may then be possible to precompute a warping 
function $\gamma_i$ for each pattern i. This process may require significant expense and a
priori information about the sampling pattern. When the warping functions have 
been constructed, they may be stored and then used directly in the reconstruction 
formulas above when that pattern is encountered. This approach is reasonable in 
any number of dimensions for which the warping functions can be computed. 

This discussion may be generalized theoretically into two dimensions. Suppose
that we have a 2D function $f(\mathbf{x})$ which has been sampled at some set of 2D points
$\{\mathbf{x}_s\}$. We assume we have a continuous, one-to-one mapping $\gamma$:

$$\xi = \gamma(\mathbf{x})$$

$$\xi_s = \gamma(\mathbf{x}_s) \tag{10.42}$$

where the former equation represents the transformation of any point $\mathbf{x}$, and the
latter is defined only for sample points $\mathbf{x}_s$. We can also assume another function h
defined by

$$h(\xi) = f(\gamma^{-1}(\xi)) \tag{10.43}$$

If h is bandlimited, then we can write the sampled signal after passing through a
low-pass filter g as

$$h(\xi) = \sum_{\{\xi_s\}} h(\xi_s)\, g(\xi - \xi_s) \tag{10.44}$$

We can now write f as

$$f(\mathbf{x}) = \sum_{\{\mathbf{x}_s\}} f(\mathbf{x}_s)\, g(\gamma(\mathbf{x}) - \gamma(\mathbf{x}_s)) \tag{10.45}$$

As before, we can state this in a theorem, this time as the 2D nonuniform sampling
theorem:


The 2D Nonuniform Sampling Theorem: A signal $f(\mathbf{x})$ is
sampled nonuniformly at points $\mathbf{x} = \mathbf{x}_s$. If a one-to-one
continuous mapping $\gamma(\mathbf{x})$ exists such that $a\mathbf{U} + b\mathbf{V} = \gamma(\mathbf{x}_s)$
for two linearly independent vectors $\mathbf{U}$ and $\mathbf{V}$, and if $h(\xi) =
f(\gamma^{-1}(\xi))$ is bandlimited to $|\omega_0| = \pi / \min(|\mathbf{U}|, |\mathbf{V}|)$, then $f(\mathbf{x})$
may be reconstructed from

$$f(\mathbf{x}) = \sum_{\{\mathbf{x}_s\}} f(\mathbf{x}_s)\, g(\gamma(\mathbf{x}) - \gamma(\mathbf{x}_s)) \tag{10.46}$$

where g is the inverse Fourier transform of an ideal 2D low-pass
filter.

Again, this theorem states nothing about the spectral properties of the original
function that was sampled and that we are reconstructing; only the projection of the
signal through the warping function $\gamma$ is required to be bandlimited.

This warping approach has been studied for computer graphics by Heckbert 
[206]. He examined a number of different filter designs for texture and image 
processing. In particular, he found that rather than warping the signal and then
processing, you can sometimes apply the inverse warp to the reconstruction filter
and then apply it directly.

Another approach to the warping process has been taken by Wolberg [485], 
who studied the requirements of nonuniform reconstruction for applying complex 
transformations to images. 


10.10.3 Iteration 

The methods based on iteration generally work by guessing an estimate of the signal,
plugging that into the known samples of the function, and using the error to derive
a new approximation. We will summarize the work of Sauer and Allebach [376] as
a representative sample of this class; other examples of this method include [330],
[281], and [142]. Sauer and Allebach actually present three variations on the same
theme; we begin by describing the basic method and then describe the two variations.

We suppose our signal f(t) has been sampled nonuniformly. Based on the
samples, we make a guess of the function and call the guess $f_1(t)$. We then apply a pair
of operators V and Q to find a new approximation $f_2(t)$:

$$f_{k+1} = T f_k = VQ f_k \tag{10.47}$$

where we have used a composite operator T to represent the sequence VQ. Applying
Equation 10.47 over and over gives us a sequence of estimates $f_k$ for the original
signal.

If our samples of f(t) are sufficient to uniquely determine the signal, and the
algorithm is convergent, then eventually we reach a fixed point where our estimates
match the signal:

$$\lim_{k \to \infty} T f_k = f \tag{10.48}$$


One way to view this type of algorithm is to think of V and Q as projecting 
their arguments into particular spaces. When these spaces are different for V and 
Q, the representation of each estimate $f_k$ bounces back and forth between the two
representations. Typically these are signal space and frequency space, so each step of 
improvement involves modifying the signal and its spectrum. This process is called 
alternating nonlinear projections onto convex sets, and there is a substantial body 
of mathematical literature addressing the subject in general [499]. Implicitly, V and 
Q include forward and inverse Fourier transforms. 

In Sauer and Allebach [376], the signal-space operator is called the sampling 
operator , and is denoted S. This operator identifies how to evaluate our estimate at 
any point. The geometry behind S is shown in Figure 10.80. 

We start by determining a scalar $\epsilon$ by searching all pairs of samples $\mathbf{x}_i$ and finding
the nearest neighbors; $\epsilon$ is set to half that distance. To find the value of the operator
S on a function $g(\mathbf{x})$ at any point $\mathbf{x}$ in 2D, we find the nearest sample $\mathbf{x}_i$ to $\mathbf{x}$, form
a ball of radius $\epsilon$ around $\mathbf{x}_i$, and integrate $g(\mathbf{x})$ over that ball. In 1D, a ball is an
interval, in 2D it is a circle, in 3D a sphere, and so on. We assume that the definition
of $\epsilon$ results in a value small enough that $g(\mathbf{x})$ is almost the same as $g(\mathbf{x}_i)$ for each
point $\mathbf{x}$ in the ball around $\mathbf{x}_i$.

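In symbols, we write the sampling operator S on a vector-valued function $g(\mathbf{x})$
as

$$Sg(\mathbf{x}) = \begin{cases} \displaystyle\int_{B_{\epsilon,i}} g(\mathbf{y})\,d\mathbf{y} & \text{for } \|\mathbf{x} - \mathbf{x}_i\| < \epsilon \\ 0 & \text{otherwise} \end{cases} \tag{10.49}$$

where $B_{\epsilon,i}$ is a ball of radius $\epsilon$ centered at $\mathbf{x}_i$, and

$$0 < \epsilon < \frac{1}{2} \min_{i,k} \|\mathbf{x}_i - \mathbf{x}_k\|, \qquad i \neq k \tag{10.50}$$

FIGURE 10.80

The circular regions of integration of the operator S.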

If we evaluate $S(f - f_k)$ at each sample point, we find the amount of error
between estimate k and the original signal.

The frequency-space operator is called $V_2$, and it is simply a perfect low-pass
filter. Operating on a spectrum G, the operator $V_2$ passes G for all frequencies $\omega$
within some finite interval $\Gamma$ centered at the origin, and sets the others to 0. In
symbols,

$$V_2 G(\omega) = \begin{cases} G(\omega) & \omega \in \Gamma \\ 0 & \text{otherwise} \end{cases} \tag{10.51}$$


The basic technique is to find the error in our estimate $f_k$ by finding $S(f - f_k)$,
scaling that by some amount $\lambda \in \mathbb{R}$, and adding that correction term back into
$f_k$. The resulting “corrected” signal is then low-pass filtered, resulting in the new
estimate. We can write this iteration algorithm for $f_{k+1}$ as

$$f_{k+1} = T f_k = V_2\left[f_k + \lambda S(f - f_k)\right] \tag{10.52}$$

Note that $V_2$ may be implemented by a frequency-space box if it is surrounded by
a Fourier transform, or as a signal-space convolution. It can be proven that when
$0 < \lambda < 2$, the algorithm of Equation 10.52 will converge.

Equation 10.52 can be written in a slightly different form that will make it easier
to compare it with the variations below. We write an operator $A_1$ operating on a
signal g as

$$A_1 g = g(\mathbf{x}) + S\left[f(\mathbf{x}) - g(\mathbf{x})\right] \tag{10.53}$$

Then the iteration becomes

$$f_{k+1} = T f_k = V_2\left[1 + \lambda(A_1 - 1)\right] f_k \tag{10.54}$$
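A discrete 1D analogue of the basic iteration, $f_{k+1} = V_2[f_k + \lambda S(f - f_k)]$, is
sketched below. It is not Sauer and Allebach's implementation: we assume a periodic
signal of known length, replace the ball-integrating sampling operator with a simple
mask that is 1 at the known sample positions, and realize the low-pass operator by
zeroing DFT bins.

```python
import numpy as np

def lowpass(signal, cutoff):
    """Ideal low-pass filter: keep DFT bins with |k| <= cutoff and zero the rest."""
    spectrum = np.fft.fft(signal)
    k = np.fft.fftfreq(len(signal), d=1.0 / len(signal))   # integer bin indices
    spectrum[np.abs(k) > cutoff] = 0.0
    return np.fft.ifft(spectrum).real

def reconstruct(sample_idx, sample_vals, length, cutoff, lam=1.0, iterations=100):
    """Iterate: add lam times the error at the known samples, then low-pass."""
    known = np.zeros(length)
    known[sample_idx] = sample_vals
    mask = np.zeros(length)
    mask[sample_idx] = 1.0
    estimate = lowpass(known, cutoff)               # initial guess
    for _ in range(iterations):
        error = mask * (known - estimate)           # stand-in for S(f - f_k)
        estimate = lowpass(estimate + lam * error, cutoff)
    return estimate
```

Each pass costs a forward and an inverse transform, which is the expense that makes
the approach unattractive for large images. The parameter `lam` plays the role of
$\lambda$, and values between 0 and 2 correspond to the convergence condition quoted above.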

Three variations on this algorithm are also given in Sauer and Allebach [376].
We call the technique of Equations 10.52 and 10.54 method 1. Method 2 is similar,
but approximates the signal as a triangular mesh over a number of sample points. 
This signal is then refined by alternating projection in signal and frequency spaces, 
as before. 

In method 2 the frequency-space operator $V_2$ is unchanged, but $A_1$ is replaced
by $A_2 = RS$, the product of a new operator R and the sampling operator S from
above. The basic idea is to treat the sample points as the vertices of triangles. The
signal at any point p may then be found by evaluating the plane equation of the
appropriate triangle at that point:

$$Rg(p) = a p_x + b p_y + c p_z + d \tag{10.55}$$

for a plane with normal (a, b, c) and constant offset d. So $A_2(f - f_k) = RS(f - f_k)$
is a collection of triangular facets. The iteration equation may be written

$$f_{k+1} = T f_k = V_2\left[f_k + \lambda A_2(f - f_k)\right] \tag{10.56}$$

In method 3, an additional constraint is introduced to compensate for the fact 
that the farther a point is from a sample point, the less we know about what the 
signal should be. Thus, the error at these points is weighted less. A distance function 
is introduced and a new operator $A_3$ is used to include its effect.

Although the convergence condition mentioned above is rigorous, Sauer and
Allebach note that it is often useful to start the method with a much more aggressive
degree of overrelaxation. If successive iterations begin to diverge, they decrease the
value of $\lambda$ and perform that iteration again. In their paper they compare the three
methods above against each other and the thin-plate spline method of Franke [149].

Although iterative techniques are theoretically appealing, they suffer the drawback
of requiring a great deal of computation. If the signal is an image, then at
every iteration we must either compute a forward and inverse Fourier transform, or
convolve with a very large approximated sinc function. When pictures have thousands
of samples on a side, this calculation time can become prohibitive. We have
presented them here because they are theoretically interesting, provide good results,
and may be practical in situations where only a few samples are involved.

FIGURE 10.81

Subdivision in Whitted’s method. (a) The initial sampling at corners and center. (b) The four
squares defined by this sampling pattern. (c) A visual expression of the weights of the samples.


10.10.4 Piecewise-Continuous Reconstruction

The most straightforward reconstruction is to assume that the neighborhood is tiled
by regions (that is, the regions fully cover the neighborhood with no gaps or
overlaps). The regions R are built from the samples, so each region i has the
value $R_i$.


Whitted's Method 

Whitted’s reconstruction method [477] essentially builds rectangular regions in the
neighborhood, and then box-filters the resulting signal to form a single flat
reconstructed surface over the neighborhood. Whitted does not provide the details of the
area subdivision technique, but one reasonable interpretation proceeds as follows 
[156]. 

Initially, the four corners and center of a square with unit area neighborhood 
are evaluated, as in Figure 10.81(a). After these first five samples are evaluated, all 
further decimations of the neighborhood will be based on squares defined by two 
diagonally opposite corners. We will consider the signal in each of these square 
regions to be constant, with intensity given by the average of the sample values on 
the corners. Thus after the first step, we have the four corners A, B, C, and D , 
and the center E. As shown in Figure 10.81(b), each corner-center pair defines one 
subsquare with total area 1/4 and intensity given by the average of the samples. So
our first estimate for the value of this neighborhood is

$$P_v = \frac{1}{4}\left(\frac{A + E}{2} + \frac{B + E}{2} + \frac{C + E}{2} + \frac{D + E}{2}\right) \tag{10.57}$$

FIGURE 10.82

One level of subdivision. (a) The new sampling points F, G, and H. (b) The evaluation of the
subpixels. (c) A visual expression of the new weights of the samples.

A visualization of these weights is shown in Figure 10.81(c), where the size of the 
area associated with each sample is equal to its weight (these areas are meant only 
to communicate the size of the weight, and are probably not the best shapes for 
real areas associated with each sample; Voronoi regions are better and are discussed 
below). 

Suppose that the upper-right corner needs refining. We take new samples F,
G, and H, as shown in Figure 10.82(a). Then the 1/4 total contribution due to
that square is now distributed among the five samples there, using the same basic
weighting as in Equation 10.57. Thus the term (B + E)/2 is replaced by a new
expression

$$\frac{B + E}{2} \rightarrow \frac{1}{4}\left(\frac{E + G}{2} + \frac{F + G}{2} + \frac{B + G}{2} + \frac{H + G}{2}\right) \tag{10.58}$$

and diagrammed in Figure 10.82(b).

We will follow the process through two more steps to clarify the subdivision
and show how some samples may be shared among regions. If we refine the
lower-right corner of the square defined by points G and H, then the term (H + G)/2 in
Equation 10.58 becomes a new expression in the new samples J, K, and L:

$$\frac{H + G}{2} \rightarrow \frac{1}{4}\left(\frac{G + K}{2} + \frac{L + K}{2} + \frac{H + K}{2} + \frac{J + K}{2}\right) \tag{10.59}$$

FIGURE 10.83

Assigning weights to a subdivided square.


Finally, we will refine the upper-left subsquare defined by points A and E in the
original neighborhood, and create new samples Q and R. We already have sample
F from the previous subdivision. The term (A + E)/2 in Equation 10.57 becomes

$$\frac{A + E}{2} \rightarrow \frac{1}{4}\left(\frac{A + R}{2} + \frac{F + R}{2} + \frac{E + R}{2} + \frac{Q + R}{2}\right) \tag{10.60}$$


If this is the end of refinement, then we can express the final value P_v for the
neighborhood by combining Equations 10.57 through 10.60. A little algebra confirms
that the weights add up to 1.0. This process is illustrated in Figure 10.83.
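
The bookkeeping behind that algebra can be sketched in a few lines of Python. This is only an
illustration under our own conventions, not part of the original method: a pair (X, Y) stands for
the average (X + Y)/2, a list of four terms stands for a subdivided square, and the helper names
accumulate and sample_weights are hypothetical.

```python
from collections import defaultdict
from fractions import Fraction


def accumulate(term, weight, weights):
    """Distribute `weight` over the samples named in `term`."""
    if isinstance(term, list):
        # A subdivided square: four sub-terms, each receiving 1/4 of the weight.
        for sub in term:
            accumulate(sub, weight / 4, weights)
    else:
        # A pair (X, Y) stands for the average (X + Y)/2.
        x, y = term
        weights[x] += weight / 2
        weights[y] += weight / 2


def sample_weights(neighborhood):
    """Return the final weight of every sample in a refined neighborhood."""
    weights = defaultdict(Fraction)
    accumulate(neighborhood, Fraction(1), weights)
    return dict(weights)


# Equation 10.57 after the substitutions of Equations 10.58 through 10.60.
neighborhood = [
    [("A", "R"), ("F", "R"), ("E", "R"), ("Q", "R")],      # (A + E)/2, refined
    [("E", "G"), ("F", "G"), ("B", "G"),                   # (B + E)/2, refined
     [("G", "K"), ("L", "K"), ("H", "K"), ("J", "K")]],    # (H + G)/2, refined again
    ("C", "E"),
    ("D", "E"),
]

weights = sample_weights(neighborhood)
total = sum(weights.values())
```

Exact fractions are used so that the sum of the weights can be compared with 1 without any
rounding error; the shared samples E, F, and G simply collect weight from every term in which
they appear.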


Wyvill and Sharp's Method

Another piecewise-linear tiling was presented by Wyvill and Sharp [491]. If a square 
neighborhood has samples only on its edges, they assume that all the edges in the 
signal are linear and radiate from the center of gravity of the samples on the square’s 
sides and corners. They assume that a maximum of one edge can pass through each 
square side. Since every edge must enter and then exit, there will be 0, 2, or 4 points 
on the square edges, as shown in Figure 10.84.

If edges are linear and only one edge may pass through each square side, then there will be 0, 2, or
4 intersections of signal edges and the square sides. Here we show the intersections, their center of
gravity, and the radial pattern associated with the configuration.


They assume that the color of each patch within the square is constant, so one 
can take the value at the corner and weight it by its associated area. Figure 10.85 
gives the formulas for this reconstruction, using the notation of Figure 10.86. 

To illustrate, suppose that we have a vertical edge that intercepts the top and
bottom of the pixel, at p_t and p_b respectively. Then from the seventh line of
Figure 10.85, we compute


$$a = \frac{1}{2}\left[(1 - p_b) + (1 - p_t)\right] \tag{10.61}$$

Central-star reconstruction. For all but the last row, the value for a is applied to the two corners
indicated. A bullet indicates a disallowed configuration.

FIGURE 10.86

The notation for the reconstruction scheme. The square has origin (0,0) in the lower left and (1,1)
in the upper right. Each intercept has a value from 0 to 1. The values of the corners are weighted
by the corresponding area.


and compute the color C from C = (a UR) + ((1 - a) LL), where we have chosen 
arbitrarily to use UR for the right color and LL for the left color; the colors at UL 
and LR respectively would have done as well. 

The most important assumption in this method is that the internal edges are linear, 
there is only one per edge, and they may be modeled by the star-radiating pattern. 
Figure 10.87 shows three interpretations of the same data: two opposite corners are 
white, the other two are black, and the four transition points are in the center of 
each edge. If white is given value 1 and black 0, we can sensibly make estimates of 
0.25, 0.75, or 0.5 for the final value. When pixels are simple, the reconstruction 
discussed here is probably reasonable, but when a situation is sufficiently hard, a 
more powerful reconstructor should be used instead. 


Painter and Sloan's Method 

Painter and Sloan [328] use a piecewise-constant reconstruction based on a stored 
tree of samples. If the samples were generated using a k-d tree, then there is an 
associated data structure of k-dimensional boxes, where each box has one sample.











FIGURE 10.87

The same configuration of corner colors and edge intercepts can lead to three different values for
the neighborhood. (a) C = 0.25. (b) C = 0.75. (c) C = 0.5.


In 2D, these boxes are just rectangles in the plane. They assume that the entire 
rectangle has a single uniform color given by its representative sample. 

This makes it particularly easy to apply the low-pass filter prior to resampling. 
Figure 10.88(a) shows a roughly Gaussian filter placed over a region of the plane 
tiled with rectangles. The filter’s total response can be broken down into a sum 
of responses, one per rectangle, as shown in Figure 10.88(b). We can write this 
symbolically as the result of a filter function f(x, y) over a signal s(x, y) defined in a
2D neighborhood N:

$$\begin{aligned}
P_v(x,y) &= \int_N f(x-u,\, y-v)\, s(u,v)\, du\, dv \\
         &= \sum_i r_i \int_{A(i)} f(x-u,\, y-v)\, du\, dv
\end{aligned} \tag{10.62}$$

where in the second line we have replaced the signal s(x, y) by a sum of rectangles
i, each with area A(i) and value r_i. Since the rectangles are disjoint, each integral is
independent of the others.

The advantage of this observation is that we have efficient tools for finding the
integral of a tabled function over any rectangular region. Some filters (e.g., polynomial
functions) may be simple enough that we can do the integration analytically.
Otherwise, we can convert the filter into a sum table [111], so that four table lookups,
three adds, and a divide will give us the integral under any rectangular region. Each
rectangle R_i needs to be clipped to the sum table boundary T before access, which is
simply a box-box intersection. If we have a routine sum-table(f) that represents
the filter function f(x, y), then we can write the total response as

$$P_v(x,y) = \sum_i r_i \times \text{sum-table}(f)(R_i \cap T) \tag{10.63}$$

FIGURE 10.88

(a) A filter f(x, y) over a domain tiled with rectangles. (b) That filter's response is the sum of its
response to each rectangle.


If the filter is separable, then it can be stored as two 1D sum tables rather than
one 2D sum table, reducing the storage costs from O(N^2) to O(N), where N is the
number of samples we have used per dimension to digitize the filter.
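
The sum-table evaluation of Equation 10.63 can be sketched as follows. This is a minimal
illustration under assumptions of our own: the filter has already been digitized into a small 2D
grid of weights, rectangles are given in integer table-cell coordinates, and the names
build_sum_table, rect_sum, and filtered_value are hypothetical. The routine returns the plain
integral of the filter over each rectangle, so no final divide is performed.

```python
def build_sum_table(filt):
    """Return table S where S[j][i] is the sum of filt over rows < j and columns < i."""
    rows, cols = len(filt), len(filt[0])
    table = [[0.0] * (cols + 1) for _ in range(rows + 1)]
    for j in range(rows):
        for i in range(cols):
            table[j + 1][i + 1] = (filt[j][i] + table[j][i + 1]
                                   + table[j + 1][i] - table[j][i])
    return table


def rect_sum(table, x0, y0, x1, y1):
    """Sum of the filter cells in [x0, x1) x [y0, y1), clipped to the table boundary T."""
    rows, cols = len(table) - 1, len(table[0]) - 1
    x0, x1 = max(0, x0), min(cols, x1)      # box-box intersection with the table
    y0, y1 = max(0, y0), min(rows, y1)
    if x0 >= x1 or y0 >= y1:
        return 0.0
    # Four lookups and three adds or subtracts.
    return table[y1][x1] - table[y0][x1] - table[y1][x0] + table[y0][x0]


def filtered_value(table, rects):
    """rects is an iterable of (value, x0, y0, x1, y1), one entry per rectangle."""
    return sum(r * rect_sum(table, x0, y0, x1, y1) for r, x0, y0, x1, y1 in rects)


# A 4 x 4 box filter of unit volume, used here only as a test case.
box = [[1.0 / 16.0] * 4 for _ in range(4)]
table = build_sum_table(box)

# Two rectangles: value 1 over the left half (extending beyond the table),
# value 0 over the right half.
rects = [(1.0, -3, 0, 2, 4), (0.0, 2, 0, 7, 4)]
response = filtered_value(table, rects)
```

The first rectangle extends past the left edge of the table, so it is clipped before the four
lookups are made; the filter then sees half of its support covered by the value 1.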

For image reconstruction, Painter and Sloan recommend a Lanczos-windowed sinc
filter with a 7 x 7 pixel support [185].

Although piecewise-constant rectangles provide a nice mathematical structure to 
the image, and permit efficient calculation of filtered values, higher-order interpolation
methods are likely to yield more accurate values. This is worth investigating
since a better reconstruction scheme may allow us to reduce the number of samples 
we need to evaluate. Painter and Sloan suggest using the sample locations to produce 
a Delaunay triangulation of the plane [346]. 

We can apply the same piecewise-constant technique applied above to rectangles 
for these triangles, or use a higher-order interpolation scheme. Two good candidates
are the algorithms by Cendes and Wong [79] and Salesin et al. [372]; the
latter is particularly powerful because it can handle discontinuities of the signal over
triangle edges. The convolution step is a bit more difficult with triangles than rectangles
because sum tables are only efficient for rectangular regions; efficient discrete
convolution over triangular regions is still an open problem.

FIGURE 10.89

Uncertain and certain samples are modeled by loose and stiff springs, respectively, and a thin plate
is built to accommodate those springs.


Thin-Plate Splines

Franke reconstructs a univariate function from nonuniform sample points by fitting 
a thin plate through the data [149]. The plate is defined by a set of splines, and is 
everywhere continuous in its first derivative. 

A similar reconstruction method has been suggested by Metaxas and Milios [299]. 
At every sample point a spring is placed; the rest length of the spring is defined by 
the value of the sample at that point, and its stiffness is related to the confidence 
associated with that value. So if we have some samples whose values we are uncertain
of, those values would be represented by relatively loose springs, while those values
we are confident about would be represented by stiff springs, as in Figure 10.89.

The surface v is created to minimize the total potential energy in the sheet. This is
the sum of two energy factors: the deformation in the sheet, and the accommodation
to the given data represented by springs, each of which has an associated rest length
and stiffness. The sheet minimizes the potential energy of deformation E_p, given by

$$E_p(v) = \frac{1}{2} \iint_A \left(v_{xx}^2 + 2 v_{xy}^2 + v_{yy}^2\right) dx\, dy \tag{10.64}$$

The region A is the area over which we will build this piece of the sheet, and the
subscripts on v are partial derivatives. To accommodate the springs, we introduce
a constraint function C(v) = 0; the energy cost in matching the constraint is E_con,
given by

$$E_{con}(v) = \frac{1}{2} \iint \beta(x,y)\, [C(v(x,y))]^2\, dx\, dy \tag{10.65}$$

where β(x, y) > 0. Since we have a finite number of springs, we can write this as a
summation

$$E_{con}(v) = \frac{1}{2} \sum_i \beta(x_i, y_i)\, [v(x_i, y_i) - c(x_i, y_i)]^2 \tag{10.66}$$

where c(x_i, y_i) is the value of sample i, and β(x_i, y_i) is our confidence in that value
(or, conversely, the amount of noise we think is in that value). The total potential
energy is the sum of these two energies:

$$E_v = E_p(v) + E_{con}(v) \tag{10.67}$$

Metaxas and Milios solve this problem with a finite-elements approach, breaking
the signal domain into a finite number of squares and solving for a solution over
each square that is C^0 and C^1 continuous with the adjacent square.
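
To make the two energy terms concrete, the following sketch evaluates them for a surface stored
on a regular grid. It is not the finite-element solver of Metaxas and Milios: the derivatives are
replaced by central finite differences, each energy is assumed to carry a factor of one half, the
springs are assumed to sit exactly on grid nodes, and the function names are hypothetical.

```python
def plate_energy(v, h=1.0):
    """Finite-difference estimate of the deformation energy E_p on a grid v[j][i]."""
    rows, cols = len(v), len(v[0])
    total = 0.0
    for j in range(1, rows - 1):
        for i in range(1, cols - 1):
            vxx = (v[j][i - 1] - 2.0 * v[j][i] + v[j][i + 1]) / h**2
            vyy = (v[j - 1][i] - 2.0 * v[j][i] + v[j + 1][i]) / h**2
            vxy = (v[j + 1][i + 1] - v[j + 1][i - 1]
                   - v[j - 1][i + 1] + v[j - 1][i - 1]) / (4.0 * h**2)
            total += (vxx**2 + 2.0 * vxy**2 + vyy**2) * h**2
    return 0.5 * total


def spring_energy(v, springs):
    """Spring energy E_con; springs is a list of (i, j, c, beta) at grid nodes."""
    return 0.5 * sum(beta * (v[j][i] - c) ** 2 for i, j, c, beta in springs)


def total_energy(v, springs, h=1.0):
    """Total potential energy: deformation plus accommodation of the springs."""
    return plate_energy(v, h) + spring_energy(v, springs)


# A tilted plane has no bending, so only the springs contribute.
plane = [[2.0 * i + 3.0 * j for i in range(5)] for j in range(5)]
springs = [(2, 2, 12.0, 4.0),   # stiff spring whose rest length disagrees with the plane
           (0, 0, 0.0, 1.0)]    # loose spring that the plane already satisfies
flat_energy = total_energy(plane, springs)

# Raising a single interior node bends the sheet and costs deformation energy.
bumped = [row[:] for row in plane]
bumped[2][2] += 1.0
bump_energy = plate_energy(bumped)
```

A minimizer would adjust the grid values to trade these two terms against each other; stiff
springs (large β) pull the sheet toward their samples, while loose springs let the bending term
dominate.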

This approach is different from the preceding methods in a number of ways. Most 
significantly, it has the ability to accommodate noisy data. It also has an interesting 
continuity condition: as defined, it produces a result that is continuous in value and 
in first derivative everywhere on the surface. This can be both an advantage and a 
problem. 

The effect of enforcing this continuity is that abrupt changes in the signal will be 
smoothed out. A disadvantage of this is that if the neighborhood is large, then sharp 
edges and corners will be lost. This can be a problem if the technique is used in the 
image plane, where we typically want sharp boundaries between objects to remain 
sharp in the image. On the other hand, if the signal needs to be low-pass-filtered prior 
to resampling, this technique produces a result very similar to that operation. An 
analytic description of the frequency-space properties of this algorithm has not been 
carried out, so it is difficult to say just what sort of low-pass filtering is performed, 
and where the transition band is located. 





Metaxas and Milios recognize the challenge of discontinuities and propose a 
method to automatically detect them and break the continuity constraint at those 
places during surface fitting. They estimate the bending moment of the sheet and 
consider any bend that is too large to effectively mean there is a big change in signal 
value, implying the presence of an edge. They note that the edges are not very well 
detected, and that when used for images, there is interference of colors around object 
edges. 

We can get additional insight into the utility of thin-plate spline methods from 
observations due to Sauer and Allebach [376]. They measured the signal-to-error 
ratio against known test data for iterative methods and a thin-plate spline method. 
They found that their iterative method had a better signal-to-error ratio after only 
three iterations, and that after forty iterations the improvement over splines is generally
between 4.5 and 33.5 dB. The result seems to be that iterative methods produce
results of equal quality to spline methods after only a few iterations, and that iterative
techniques continue to improve after that, while the spline solution, once
determined, remains fixed. 


10.10.5 Local Filtering 

Another approach to nonuniform reconstruction is essentially the same idea that is 
used when we want to resample a uniformly sampled signal, which is to simply apply 
a reconstruction filter (or a combined reconstruction/low-pass filter) directly to the 
sampled data, centered over the desired reconstruction location. Exactly how a filter 
is applied to nonuniform sample geometry varies from algorithm to algorithm. 

A good survey of filter characteristics with regard to image quality is given in 
Mitchell and Netravali [310]. The following discussion follows their presentation 
and uses their images to demonstrate filter behavior. 

There are several criteria we want our reconstruction filters to achieve. Of course, 
we want their frequency-space response to be such that they do not attenuate the 
baseband (central copy of the signal spectra) too much, and that they allow only a 
little energy from spectral replicas of the baseband to pass through. The ideal box 
filter matches these two constraints perfectly, though it is not realizable in practice 
because it has an infinite impulse response. 

Most filters used in graphics are space-invariant , meaning that their shape and 
size do not change with respect to the domain being filtered. The alternative is 
a space-variant filter, such as the elliptically weighted average filter developed by 
Greene and Heckbert [168]. They also present a good summary of different filter 
types for texture-mapping applications. 

After discussing some filter characteristics, we will survey some filters that have 
been proposed in the literature.


Filter Criteria

It may be surprising to note that when a picture is reconstructed with a practical
implementation of the theoretically ideal low-pass filter, the result is often not a
satisfactory image. An example is shown in Figure 10.90(a) (color plate), where
we have enlarged an image by convolving with a very wide sinc filter. Notice
the distracting ringing around the edges. This ringing is a combined result of the
truncated reconstruction filter and the fact that our image function is effectively
windowed by a box filter; the image is 0 outside of its square boundaries. This sudden
discontinuity in intensity introduces some high frequencies into the signal that show
up in the reconstruction. So it is insufficient to apply even a good approximation
of the sinc to an image and expect a good-looking result; practical reconstruction
requires a closer look at the filters and their effect on the image.

A useful set of criteria for characterizing a filter includes sample-frequency ripple,
anisotropic effects, ringing, blurring, and reconstruction error or post-aliasing. These
are discussed in turn, following the presentation by Mitchell and Netravali [310].

Sample-frequency ripple occurs when the DC component of the signal aliases 
and is included in the reconstructed signal. The name comes from the fact that 
this component shows up in the spectrum at multiples of the sampling frequency. 
The visual artifact is shown in Figure 10.90(b). To remove this from the signal, we 
would like our filter to be zero at all multiples of the sampling frequency; then DC 
components at those frequencies will be canceled out. 

Anisotropic effects are visible when the filter has unequal response in different 
spatial directions. A particularly common example of this type of problem is when a 
separable filter is used on a square resampling grid, and the underlying square pixels 
show through. An example appears in Figure 10.90(c). 

Ringing occurs when an edge turns into a rippling set of lighter and darker 
bands, as shown earlier in Figure 10.90(a). This is a result of alternating positive 
and negative lobes in the filter response, which serve to increase and decrease the 
signal in the neighborhood of an edge. Ringing can be useful for edge sharpening 
when it is carefully controlled. 

If the filter attenuates the baseband too much in the higher frequencies, then sharp 
edges become blurry, as in Figure 10.90(d). 

If the filter allows too much of the spectrum beyond the Nyquist rate to survive 
in the reconstructed image, we get post-aliasing, as in Figure 10.90(e). 

In practice we cannot satisfy all of these criteria at once, but rather must make 
some trade-offs and get good behavior in some categories at the expense of inferior 
quality in others. 

Mitchell and Netravali point out that if each sample contains not just its value 
but also the derivatives of the signal at that point, then reconstruction may proceed 
more accurately or more quickly with a given number of samples. This information 
is usually difficult to obtain in rendering systems.

Pavicic [335] has suggested that one characterization of a good filter is that when the
original signal is flat (i.e., all samples have the same value), then the reconstructed
function should also be flat. That is, if all the sample values are 1.0, then the
convolution sum h of the sampled signal s and the filter f in Equation 10.68 should
be as close as possible to 1.0 for all values of x and y.

$$h(x,y) = \sum_{i=-\infty}^{\infty} \sum_{k=-\infty}^{\infty} s(i,k)\, f(i-x,\, k-y) \tag{10.68}$$

This problem is very similar to the flat-field display response studied in Chapter 3. 
There we were given the display function (which was also radially symmetrical) and 
we sought the proper intersample spacing to achieve a flat field. Here we have the 
opposite problem, where we are given the spacing but seek a reconstruction filter 
that achieves a flat field when evaluated at all points within the field. 

To measure the quality of the field, Pavicic proposed a test situation where all 
samples have height 1 on a square lattice of side 1. His measurement involved finding 
the volume of the difference between a flat sheet of height 1 and the filter response 
f over the square. We call the absolute error at each point e(x, y):

$$e(x,y) = |f(x,y) - 1| \tag{10.69}$$

Pavicic integrates this error over the sample square to get a single volume measure e_v:

$$e_v = \iint e(x,y)\, dx\, dy \tag{10.70}$$

Max [284] also finds the maximum difference in contrast, e_c, over the reconstructed
signal h = f * s:

$$e_c = \max_{0 \le x,y \le 1} h(x,y) - \min_{0 \le x,y \le 1} h(x,y) \tag{10.71}$$

He has also used the RMS error e_rms of e(x, y) over a dense set of n points inside
the square:

$$e_{rms} = \sqrt{\frac{1}{n} \sum_i \sum_k e^2(i,k)} \tag{10.72}$$

We will use these measures below to compare filters. 
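
These three measures are straightforward to estimate numerically. The sketch below is only an
illustration under our own assumptions: every sample has value 1 on the unit lattice, the filter
response is taken to be the summed response h of Equation 10.68, the integrals are replaced by
averages over a regular grid of points inside the unit square, and the names flat_field,
flat_field_errors, tent, and cone are hypothetical.

```python
import math


def flat_field(f, x, y, reach=2):
    """Summed filter response at (x, y) with every sample s(i, k) = 1."""
    return sum(f(i - x, k - y)
               for i in range(-reach, reach + 2)
               for k in range(-reach, reach + 2))


def flat_field_errors(f, n=32, reach=2):
    """Estimate (e_v, e_c, e_rms) on an n x n grid of points inside the unit square."""
    values = [flat_field(f, (a + 0.5) / n, (b + 0.5) / n, reach)
              for a in range(n) for b in range(n)]
    errors = [abs(h - 1.0) for h in values]
    e_v = sum(errors) / len(errors)       # midpoint rule over a square of area 1
    e_c = max(values) - min(values)
    e_rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    return e_v, e_c, e_rms


def tent(dx, dy):
    """Separable tent filter of half-width 1; its flat-field response is exactly flat."""
    return max(0.0, 1.0 - abs(dx)) * max(0.0, 1.0 - abs(dy))


def cone(dx, dy):
    """Unnormalized radially symmetric cone of radius 1."""
    return max(0.0, 1.0 - math.hypot(dx, dy))


tent_errors = flat_field_errors(tent)
cone_errors = flat_field_errors(cone)
```

The tent filter serves as a sanity check because its copies sum to exactly 1 everywhere, so all
three errors vanish; the cone shows a visible contrast between the center of the square and its
corners.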

Normalization

It is important to normalize our filters, which requires dividing their response by
their total volume; this makes their response to a flat-field input of 1 equal to 1. This
is particularly easy if the filter is radially symmetric. The cylindrical-shell method
from vector calculus says that if we have a radially symmetric filter defined by f(r)
evaluated in an interval [0, b] where f(r) > 0 for all r ∈ [0, b], then the volume
obtained by rotating this around the Z axis is

$$V = 2\pi \int_0^b r f(r)\, dr \tag{10.73}$$


If the filter is given piecewise, the integrals may be done over smaller intervals and 
summed together. 
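
When the integral is inconvenient to do by hand, the shell integral can be approximated
numerically. The following is a small sketch, assuming a midpoint-rule quadrature is accurate
enough for the filter at hand; the names radial_volume and normalized, and the cone used as a
test filter, are our own.

```python
import math


def radial_volume(f, b, n=10000):
    """Approximate V = 2*pi * integral from 0 to b of r f(r) dr with the midpoint rule."""
    dr = b / n
    return 2.0 * math.pi * dr * sum((k + 0.5) * dr * f((k + 0.5) * dr)
                                    for k in range(n))


def normalized(f, b):
    """Return a unit-volume version of the radially symmetric filter f on [0, b]."""
    volume = radial_volume(f, b)
    return lambda r: f(r) / volume if 0.0 <= r <= b else 0.0


def cone(r):
    """Hypothetical test filter on [0, 1]; its exact volume is pi/3."""
    return 1.0 - r


volume = radial_volume(cone, 1.0)
unit_cone = normalized(cone, 1.0)
```

For a piecewise filter, the same routine can be called once per interval and the results added.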


Noise Sensitivity

In the physical sciences, unwanted noise can influence the samples of a signal in 
several ways. Many techniques have been developed to deal with various types of 
noise during reconstruction. We don’t have the same kind of explicit problem with 
noise in computer graphics, since in theory we can compute with arbitrary accuracy. 
In fact, there is always a limit on the precision with which we carry out any of our 
calculations, and there is quantization error on top of that. A careful characterization 
of the errors we introduce depends on the precise algorithms used, the program that 
implements them, and the computer that the program runs on. But if we use a 
nonuniform sampling process, then we are deliberately introducing high-frequency 
noise into our samples to avoid regular aliasing artifacts. An important problem 
in reconstruction from nonuniform samples is to filter out this high-frequency noise 
before we resample the signal. 

One form of high-frequency noise is known as shot noise , which occurs when a 
few samples out of many have a significantly different value than the others. Such 
samples are sometimes called rogues or outliers.

The technique of Metaxas and Milios discussed above is tolerant of some noise, 
but we need to identify the rogues and give them small spring constants (loose springs), or the
algorithm will attempt to match those samples, and perhaps even deduce the presence 
of an edge. This is not a problem unique to their algorithm; robust tolerance of noise 
is a difficult problem. 

Lee and Redner [260] have suggested using a class of nonlinear filters called 
alpha filters to eliminate rogues. These filters may be used in conjunction with other 
reconstruction techniques. The basic idea is that in any neighborhood, we gather 
together the n samples s_i and sort them into increasing order. We then discard
samples from both ends of the sorted list, starting with s_1 and s_n, then removing s_2
and s_{n-1}, and so on. The number of samples to remove is given as a percentage α
of the number of samples n.

The alpha-trimmed mean s̄_α for the n samples {s_1, s_2, ..., s_n} is defined by

$$\bar{s}_\alpha = \begin{cases}
\dfrac{1}{n - 2\lfloor \alpha n \rfloor} \displaystyle\sum_{i=\lfloor \alpha n \rfloor + 1}^{n - \lfloor \alpha n \rfloor} s_i & n \text{ odd} \\[3ex]
\dfrac{1}{n - 2\lfloor \alpha (n-1) \rfloor} \displaystyle\sum_{i=\lfloor \alpha (n-1) \rfloor + 1}^{n - \lfloor \alpha (n-1) \rfloor} s_i & n \text{ even}
\end{cases} \tag{10.74}$$


Note that when α = 0, no values are trimmed from the summation, and the result is
the mean of all the samples; this is equivalent to a box filter. When α = 0.5, then we
get back the median value of the set of samples. The filter is also dependent on its
width w, which is the radius of a circle about the pixel center; samples falling within
this circle are included in the filter.
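
The alpha-trimmed mean is short enough to state directly in code. The sketch below assumes
that the samples have already been gathered from the circle of radius w and that
0 ≤ α ≤ 0.5; the function name alpha_trimmed_mean is our own.

```python
import math


def alpha_trimmed_mean(samples, alpha):
    """Alpha-trimmed mean of a list of sample values, for 0 <= alpha <= 0.5."""
    s = sorted(samples)
    n = len(s)
    if n == 0:
        raise ValueError("need at least one sample")
    # Number of samples discarded from each end of the sorted list.
    if n % 2 == 1:
        k = math.floor(alpha * n)
    else:
        k = math.floor(alpha * (n - 1))
    kept = s[k:n - k]
    return sum(kept) / len(kept)


samples = [1.0, 2.0, 3.0, 4.0, 100.0]             # 100.0 plays the role of a rogue
plain_mean = alpha_trimmed_mean(samples, 0.0)     # no trimming: the ordinary mean
median = alpha_trimmed_mean(samples, 0.5)         # full trimming: the median
even_median = alpha_trimmed_mean([1.0, 2.0, 3.0, 100.0], 0.5)
```

With no trimming the rogue drags the result far from the other samples, while the fully trimmed
filter ignores it.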

Lee and Redner discuss the use of alpha-trimmed means in a variety of disciplines, 
and stress its utility for removing rogue values while preserving edges. They note 
that the filter is usually much too large for typical image-synthesis applications if 
samples are assumed to be distributed with a density of about one per pixel. Either 
we should work with a much higher sampling density, or create interpolated values 
near a pixel center derived from the pixel’s neighbors. 

They also note that repeated application of the filter with different values for the
parameters (α, w) usually gives results that are superior to a single application. For
a relatively smooth ray-traced scene (no textures and mostly diffuse objects) with
eight samples per pixel, they reported good results from the two-pass combination
[(0.5, 1.0), (0.4, 1.0)]. For an image with more high-frequency content, the two-pass
combination [(0.5, 0.5), (0.4, 0.25)] worked well. They do not give suggestions for
automatically picking values of α and w from the samples themselves, but it seems
that as the image becomes more complex, a good approach is to use a filter with a
radius on the order of half the intersample spacing and a relatively high value of α
(near 0.5), to remove most of the pops. A second pass with a smaller radius and
smaller α removes the less dramatic rogues without overly softening the edges.

Alpha-trimmed mean filters are useful when the sampling density is roughly
uniform in the neighborhood being sampled. They are not appropriate in regions
of extreme variation of sampling density. If the filter sits over a region where there
are many samples clumped together, they will tend to overwhelm the samples in the
more sparsely sampled part of the region, leading to an incorrect average (this is true
even when α = 0). The second discussion on multistage filtering below addresses
that issue.

The three types of sample patterns studied by Yen [497]. (a) Migration of a finite number of points.
(b) A single gap in the distribution. (c) Recurrent nonuniform sampling. Redrawn from Yen in IRE
Transactions on Circuit Theory, figs. 1-3, pp. 252-253.


10.10.6 Yen's Method

Yen [497] studied reconstruction from three types of nonuniform sampling patterns. 
These are illustrated in Figure 10.91. 

The first type of pattern occurs when a finite number of samples move from a 
regular pattern to slightly different locations, as in Figure 10.91(a). The second 
pattern occurs when a single discrepancy occurs in the pattern; for example, all 
samples beyond a certain point are shifted by a constant value with respect to all 
samples before that point, as in Figure 10.91(b). This could be caused by a one-time 
mechanical failure in a sampling instrument, for example.

Most important to our study is the pattern Yen called recurrent nonuniform
sampling, which is the case when the sampling pattern is formed by a repeated cluster
of N nonuniformly spaced samples, as in Figure 10.91(c).

Yen showed how to reconstruct the function f from these samples, when it
is bandlimited to a maximum frequency of ω_max and the clusters are spaced in
intervals N/(2ω_max). He introduced a minimum-energy constraint to obtain a unique
interpolating function. The idea is that one can think of the sampling pattern as N
separate uniform patterns, each with an intersample spacing of N/(2ω_max). We can
thus write the total sampling sequence as the sum of N subsequences τ_pm, each of
which is an infinite set of periodic samples:

$$\sum_{p=1}^{N} \tau_{pm} = \sum_{p=1}^{N} \sum_{m=-\infty}^{\infty} \left(t_p + \frac{mN}{2\omega_{\max}}\right) \tag{10.75}$$

Then the reconstructed function f(t) is given by

$$f(t) = \sum_{p=1}^{N} \sum_{m=-\infty}^{\infty} f(\tau_{pm})\, \Psi_{pm}(t) \tag{10.76}$$

where

$$\Psi_{pm}(t) = \frac{\displaystyle\prod_{q=1}^{N} \sin\left[\frac{2\pi\omega_{\max}}{N}(t - t_q)\right]}{\displaystyle\prod_{q=1,\, q \neq p}^{N} \sin\left[\frac{2\pi\omega_{\max}}{N}(t_p - t_q)\right]} \cdot \frac{(-1)^{mN}}{\dfrac{2\pi\omega_{\max}}{N}\left(t - t_p - \dfrac{mN}{2\omega_{\max}}\right)} \tag{10.77}$$


Jerri [231] notes that this technique is theoretically superior to alternatives such 
as low-pass filtering and spline interpolation, but Sankur and Gerhardt [375] report 
that it can be difficult or impractical to implement. Yen provides some reasons for this 
difficulty in his original paper. When the samples are closely bunched, the distances 
t_p - t_q become small. Since these values are in the denominator of Equation 10.77,
it causes the values of the reconstruction formula at that point to become very large. 

When the signal values are amplified by a large value, any errors in the values 
are correspondingly amplified. Thus the values of bunched-up samples must be of 
increasingly higher precision as the bunching becomes more dense. Yen also notes 
that derivatives can be important in the reconstruction process, and that they can 
also impose precision requirements on the sample values. 


Pavicic

Pavicic investigated the flat-field response of several different radially symmetric 
filters and found their volume errors e_v.

Radius    Height
0.00      1.000
0.25      0.788
0.50      0.558
0.75      0.149
1.00      0.000

FIGURE 10.92

Coefficients for a nonuniform cubic B-spline filter.


He found his best filter by direct calculation; it is given by a nonuniform cubic 
B-spline with the coefficients in Figure 10.92. This filter has a volume error e_v = 0.60.
The flat-field response of this filter is given in Figure 10.93. 


Cook's Filter


Cook proposed reconstruction with a difference of Gaussians [101]. This radially 
symmetric filter is given as a function of r, the distance from the pixel center. The 
filter is drawn from a family of one-parameter filters based on w, the filter width.
For image reconstruction, Cook sets w = 1.5. The filter is given by

$$f(r) = \begin{cases} e^{-r^2} - e^{-w^2} & r < w \\ 0 & \text{otherwise} \end{cases} \tag{10.78}$$

and is shown in Figure 10.94. The volume of this filter is given by

$$V = \pi\left[1 - e^{-w^2}\left(1 + w^2\right)\right] \tag{10.79}$$

We can think of Equation 10.78 as a Gaussian bump that has been shifted downward
by a constant and then windowed by a box. Thus, the filter does not blend
away smoothly. Its derivative is

$$\frac{df}{dr} = -2r e^{-r^2} \tag{10.80}$$

so when the filter reaches 0 at r = w, it has a derivative of -2w e^{-w^2}.
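
As a quick check on these expressions, the filter and its volume can be written out and compared
against a numerical shell integral. This is only a sketch: it assumes the filter is the Gaussian
exp(-r^2) lowered by the constant exp(-w^2), which is the form consistent with the derivative and
the volume given above, and the function names are our own.

```python
import math


def cook_filter(r, w=1.5):
    """A Gaussian bump lowered so that it reaches 0 at r = w, and 0 beyond."""
    return math.exp(-r * r) - math.exp(-w * w) if r < w else 0.0


def cook_volume(w=1.5):
    """Closed-form volume of the filter of width w."""
    return math.pi * (1.0 - math.exp(-w * w) * (1.0 + w * w))


def numeric_volume(w=1.5, n=20000):
    """Midpoint-rule estimate of 2*pi * integral from 0 to w of r f(r) dr."""
    dr = w / n
    return 2.0 * math.pi * dr * sum(
        (k + 0.5) * dr * cook_filter((k + 0.5) * dr, w) for k in range(n))


def slope_at_edge(w=1.5):
    """Derivative of the filter as r approaches w from below."""
    return -2.0 * w * math.exp(-w * w)


analytic = cook_volume(1.5)
numeric = numeric_volume(1.5)
edge_slope = slope_at_edge(1.5)
```

The nonzero slope at r = w is the abrupt clipping that lets some high frequencies through.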

Because of the abrupt clipping created by the box, the filter will pass some 
high frequencies, and this is evident in the Fourier transform of the filter shown in 
Figure 10.94(b).

FIGURE 10.93

The Pavicic filter. (a) Filter curve. (b) Fourier transform of (a). (c) 3D plot of flat-field response.
(d) Contour plot of (c).


Cook mentions two ways to apply the filter. In the first, we evaluate the filter 
value at each sample location and use that to weight the sample. The second method 
is appropriate when the samples have been jittered on a regular grid and the jitter 
distance is small compared to the filter width. Then the filter may be converted into 
a 2D table of values, which is placed over the desired reconstruction point. Each 
sample is weighted by the nearest stored filter value.

FIGURE 10.94

Cook's difference of Gaussians filter. (a) Filter curve. (b) Fourier transform of (a). (c) 3D plot of
flat-field response. (d) Contour plot of (c).


Bouville et al.

In Bouville et al. [59] the sampling geometry is expected to be a diamond lattice, 
and the resampling pattern is a rectangular lattice. In this way they distinguish 
the reconstruction filter from the following low-pass filter. In the paper they provide 
coefficients for filters of various sizes. The coefficients for the reconstruction filter are 
given in Figure 10.95. Note that the filter is vertically and horizontally symmetrical 
about the origin and designed for a diamond lattice; the entries of 0 correspond to 
sample locations that are not expected to have values.

FIGURE 10.95

The coefficients of the 7 x 7 reconstruction filter. Source: Data from Bouville et al. in Proc.
Eurographics '91.


Dippé and Wold

Dippé and Wold [124] suggest the use of a one-parameter family of radially symmetric
raised cosine filters f(r). The filter parameter is w, which specifies the width.
The filter is given by

$$f(r) = \begin{cases} \dfrac{1}{2}\left[\cos\left(\dfrac{\pi r}{w}\right) + 1\right] & r < w \\[2ex] 0 & \text{otherwise} \end{cases} \tag{10.81}$$


The volume of this filter is simply 


V = 7TW 


(10.82) 


Dippé and Wold [124] discuss how to choose w in Equation 10.81 on the basis
of the local sampling density. Suppose that e_tot is the estimated RMS error in the
signal within a filter of width w_0, and we desire to reconstruct with an RMS error
bounded by e_b. Then the reconstruction filter width w is given by


w = Wq 



(10.83) 


The response for this filter is given in Figure 10.96. 

They also present a discussion of another filter, based on an assumption that the 
signal is statistically stationary. Unfortunately, most signals in computer graphics 
(with the notable exception of noise textures) are not stationary, so the analysis is 
not directly applicable.

FIGURE 10.96

The Dippé and Wold filter for w = 1. (a) Filter curve. (b) Fourier transform of (a). (c) 3D plot of
flat-field response. (d) Contour plot of (c).


Max

Max [284] has noted that for image reconstruction, it is desirable to have the sum
of the filters be C^1 smooth in the area between the samples, and observed that
the Pavicic filter described above does not fit that criterion. He proposed a class
of radially symmetric two-parameter filters f(r) based on a defining curve g(r) that
satisfies the following criteria (the notation g' refers to the derivative of g with respect
to r):

1 g(r) is a nonuniform quadratic spline.

2 g(r) is downward sloping in [0, s] and upward sloping in [s, t], where s, t ∈ R
and 0 < s < t.

3 g(s^-) = g(s^+).

4 g'(s^-) = g'(s^+).

5 g(0) = 1, g'(0) = 0, g(t) = 0, g'(t) = 0.

Conditions 1 and 2 set the general nature of the curve, which is defined in two parts,
[0, s] and [s, t]. Conditions 3 and 4 guarantee that the two parts meet with C^0 and
C^1 continuity, and the remaining conditions ensure that the curve starts at one and
drops to 0, with a flat derivative at both ends.

Max's filter g(r) meeting these criteria is given by

$$g(r) = \begin{cases}
1 - \dfrac{r^2}{st} & 0 \le r \le s \\[2ex]
\dfrac{(t-r)^2}{t(t-s)} & s < r \le t \\[2ex]
0 & r > t
\end{cases} \tag{10.84}$$


This curve is plotted in Figure 10.97. 

To normalize the filter, we need to know its volume V, which is given by 

\[
V(s,t) = 2\pi \left(
\frac{s^2}{2} - \frac{s^4}{4st}
+ \frac{t^4}{12\,t(t - s)}
- \frac{s^2 t^2}{2\,t(t - s)}
+ \frac{2 s^3 t}{3\,t(t - s)}
- \frac{s^4}{4\,t(t - s)}
\right)
\tag{10.85}
\]


We define the two-parameter filter family f(r) = (1/V) g(r), where each filter has 
unit volume. 
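
As a quick check on Equations 10.84 and 10.85, the following Python sketch evaluates the defining curve and compares the closed-form volume against a numerical integral of 2πr g(r) over [0, t]. The function names (`max_g`, `max_volume`, `numeric_volume`, `max_filter`) and the midpoint-rule integration are illustrative assumptions, not part of Max's presentation.

```python
import math

def max_g(r, s, t):
    """Defining curve g(r) of Equation 10.84 (requires 0 < s < t)."""
    r = abs(r)
    if r <= s:
        return 1.0 - r * r / (s * t)
    if r <= t:
        return (t - r) ** 2 / (t * (t - s))
    return 0.0

def max_volume(s, t):
    """Closed-form volume under the radially symmetric g, Equation 10.85."""
    q = t * (t - s)
    return 2.0 * math.pi * (
        s ** 2 / 2.0
        - s ** 4 / (4.0 * s * t)
        + t ** 4 / (12.0 * q)
        - s ** 2 * t ** 2 / (2.0 * q)
        + 2.0 * s ** 3 * t / (3.0 * q)
        - s ** 4 / (4.0 * q)
    )

def numeric_volume(s, t, n=20000):
    """Midpoint-rule estimate of 2*pi * integral of g(r) * r dr on [0, t]."""
    h = t / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += max_g(r, s, t) * r
    return 2.0 * math.pi * total * h

def max_filter(r, s, t):
    """Unit-volume filter f(r) = g(r) / V."""
    return max_g(r, s, t) / max_volume(s, t)

if __name__ == "__main__":
    s, t = 0.4810, 1.3712  # the parameter pair used for Figure 10.97
    print(max_volume(s, t), numeric_volume(s, t))
```

If the two printed volumes agree to several digits, the closed form and the piecewise definition are consistent with each other for that choice of s and t.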

Max analyzed his filter using the same setup as Pavicic: four identical filters 
were placed at the corners of a square, and the field between them was analyzed for 
deviance from a perfectly flat sheet of height 1. Max searched for three different 
criteria: smallest contrast (maximum height minus minimum height), smallest value 
of e_v as defined by Pavicic, and smallest RMS error between the summed field and 1. 
The smallest value for each of these criteria, and the values of s and t where it was 
achieved, are given in Figure 10.98. 

Note that the error value of e_v = 0.210 is much lower than Pavicic’s minimum of 
e_v = 0.60. The response of this filter is shown in Figure 10.97. 


Mitchell and Netravali 

Mitchell and Netravali [310] have developed a set of two-parameter filters 
appropriate for many reconstruction tasks. 

The filters f(r) are based on two parameters, B and C, and are defined for the 


FIGURE 10.97 

The Max filter with s = 0.4810 and t = 1.3712. (a) Filter curve, (b) Fourier transform of (a), 
(c) 3D plot of flat-field response, (d) Contour plot of (c). 


range |r| < 2. The filters are given by 


\[
f(r) = \frac{1}{6}
\begin{cases}
(12 - 9B - 6C)|r|^3 + (-18 + 12B + 6C)|r|^2 + (6 - 2B) & |r| < 1 \\
(-B - 6C)|r|^3 + (6B + 30C)|r|^2 + (-12B - 48C)|r| + (8B + 24C) & 1 \le |r| < 2 \\
0 & \text{otherwise}
\end{cases}
\tag{10.86}
\]

The volume of this filter is given by 

\[
V = 2\pi \, \frac{9 + 5B - 4C}{60}
\tag{10.87}
\]


Equation 10.86 includes several well-known filters. The value (B, C) = (1, 0) 

Criterion    Value    s         t
Contrast     0.885    0.4848    1.3778
e_RMS        0.245    0.4810    1.3712
e_v          0.210    0.4792    1.3682

FIGURE 10.98 

Results for the Max filter. 


is the cubic B-spline, (0, C) is the one-parameter family of cardinal cubic splines, 
(0, 0.5) is the Catmull-Rom cubic spline, and (B, 0) contains the tensioned B-splines 
discussed by Duff [131]. 
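
The special cases just listed are easy to confirm numerically. The sketch below evaluates the two-parameter cubic of Equation 10.86 and its volume from Equation 10.87; the names `mitchell_netravali`, `mn_volume`, and `numeric_volume` are hypothetical, and the midpoint-rule integration is only a sanity check, not part of the original derivation.

```python
import math

def mitchell_netravali(r, B=1.0 / 3.0, C=1.0 / 3.0):
    """Two-parameter cubic filter of Equation 10.86, defined for |r| < 2."""
    r = abs(r)
    if r < 1.0:
        v = ((12 - 9 * B - 6 * C) * r ** 3
             + (-18 + 12 * B + 6 * C) * r ** 2
             + (6 - 2 * B))
    elif r < 2.0:
        v = ((-B - 6 * C) * r ** 3
             + (6 * B + 30 * C) * r ** 2
             + (-12 * B - 48 * C) * r
             + (8 * B + 24 * C))
    else:
        return 0.0
    return v / 6.0

def mn_volume(B, C):
    """Volume of the radially symmetric filter, Equation 10.87."""
    return 2.0 * math.pi * (9 + 5 * B - 4 * C) / 60.0

def numeric_volume(B, C, n=20000):
    """Midpoint-rule estimate of 2*pi * integral of f(r) * r dr on [0, 2]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * h
        total += mitchell_netravali(r, B, C) * r
    return 2.0 * math.pi * total * h

if __name__ == "__main__":
    # Catmull-Rom (0, 0.5) interpolates: it is 1 at r = 0 and 0 at r = 1.
    print(mitchell_netravali(0.0, 0.0, 0.5), mitchell_netravali(1.0, 0.0, 0.5))
    # The cubic B-spline (1, 0) does not interpolate: it is 2/3 at r = 0.
    print(mitchell_netravali(0.0, 1.0, 0.0))
    print(mn_volume(1 / 3, 1 / 3), numeric_volume(1 / 3, 1 / 3))
```

The interpolating behavior of the Catmull-Rom case and the smoothing behavior of the B-spline case correspond to opposite ends of the line 2C + B = 1.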

Mitchell and Netravali attempted to characterize this 2D space of splines by an 
informal experiment. They selected an image for reconstruction, and on a single 
monitor showed reference reconstructions that were blurry, ringing, anisotropic, and 
of high quality. In 
the center was shown an image reconstructed with the filter of Equation 10.86 for 
some values of B and C, and subjects were asked to choose which reference image 
the center picture was most similar to. Their results are shown in Figure 10.99. 

The dotted line 2C + B = 1 in Figure 10.99 represents a line where an analysis 
suggests good splines can be found; notice that it contains the cubic B-spline (1, 0) 
and the Catmull-Rom spline (0, 0.5). Mitchell and Netravali suggest that the best 
trade-off may be found at (1/3, 1/3), which is plotted in Figure 10.100. 

The frequency response of the filters in Equation 10.86 is given by 

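\[
F(\omega) = \frac{3 - 3B}{(\pi\omega)^2}
\left[ \operatorname{sinc}^2(\omega) - \operatorname{sinc}(2\omega) \right]
+ \frac{2C}{(\pi\omega)^2}
\left[ -3\operatorname{sinc}^2(2\omega) + 2\operatorname{sinc}(2\omega) + \operatorname{sinc}(4\omega) \right]
+ B \operatorname{sinc}^4(\omega)
\tag{10.88}
\]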

The response of the (1/3,1/3) filter is plotted in Figure 10.100. 
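
The closed-form response can be compared against a direct numerical Fourier transform of the 1D kernel. The sketch below assumes the normalized convention sinc(x) = sin(πx)/(πx) and frequency ω measured in cycles per sample spacing; both are assumptions about notation, and the function names are hypothetical.

```python
import math

def sinc(x):
    """Normalized sinc, sin(pi x) / (pi x), with sinc(0) = 1 (assumed convention)."""
    if x == 0.0:
        return 1.0
    return math.sin(math.pi * x) / (math.pi * x)

def mn_kernel(x, B, C):
    """1D cubic kernel of Equation 10.86."""
    x = abs(x)
    if x < 1.0:
        v = (12 - 9 * B - 6 * C) * x ** 3 + (-18 + 12 * B + 6 * C) * x ** 2 + (6 - 2 * B)
    elif x < 2.0:
        v = (-B - 6 * C) * x ** 3 + (6 * B + 30 * C) * x ** 2 + (-12 * B - 48 * C) * x + (8 * B + 24 * C)
    else:
        return 0.0
    return v / 6.0

def mn_response(w, B, C):
    """Closed-form frequency response in the form of Equation 10.88."""
    if w == 0.0:
        return 1.0  # limiting value: the 1D kernel integrates to one
    a = (math.pi * w) ** 2
    return ((3 - 3 * B) / a * (sinc(w) ** 2 - sinc(2 * w))
            + 2 * C / a * (-3 * sinc(2 * w) ** 2 + 2 * sinc(2 * w) + sinc(4 * w))
            + B * sinc(w) ** 4)

def numeric_response(w, B, C, n=20000):
    """Midpoint-rule cosine transform: 2 * integral of f(x) cos(2 pi w x) on [0, 2]."""
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * h
        total += mn_kernel(x, B, C) * math.cos(2.0 * math.pi * w * x)
    return 2.0 * total * h

if __name__ == "__main__":
    for w in (0.25, 0.5, 1.0):
        print(w, mn_response(w, 1 / 3, 1 / 3), numeric_response(w, 1 / 3, 1 / 3))
```

Because the kernel is even, only the cosine part of the transform is needed, which keeps the numerical check short.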

10.10.7 Multistep Reconstruction 

All the reconstruction techniques discussed above place a copy of the appropriate 
filter over the resampling point and generate an interpolated value at that point. 
There is no preprocessing of the samples once they have been evaluated. 

This process works well when the sampling density under the filter is roughly 




FIGURE 10.99 

(B, C) filter space. Note that the B parameter is on the vertical axis and the C parameter on the 
horizontal. Redrawn from Mitchell and Netravali in Computer Graphics (Proc. Siggraph '88), 
fig. 13, p. 224. 

uniform. Then we can estimate the local sampling rate at the resample point, choose 
a filter of the appropriate width, and reconstruct. But when the sampling density 
under the filter is far from uniform, artifacts can appear in the reconstructed signal. 
We refer to the artifacts caused by variant sampling density in an image as grain 
noise [307]. 

If the sampling density is nearly uniform, we can suppress the artifacts by using 
a weighted-average filter [101]. In one dimension, we write the reconstructed signal 
r(x) as the sum of the samples s(x_n), each weighted by the reconstruction filter f 
centered at x, divided by the sum of the weights applied to the samples: 

\[
r(x) = \frac{\sum_n f(x - x_n)\, s(x_n)}{\sum_n f(x - x_n)}
\tag{10.89}
\]
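
A minimal sketch of this weighted-average reconstruction in one dimension follows. The tent filter and the names `tent` and `weighted_average` are assumptions chosen for illustration; any filter f could be substituted.

```python
def tent(x, width=1.0):
    """A simple tent (triangle) filter; an illustrative choice, not prescribed by the text."""
    return max(0.0, 1.0 - abs(x) / width)

def weighted_average(x, positions, values, f=tent):
    """Reconstruct r(x) as sum(f(x - x_n) * s(x_n)) / sum(f(x - x_n))."""
    num = 0.0
    den = 0.0
    for xn, sn in zip(positions, values):
        w = f(x - xn)
        num += w * sn
        den += w
    if den == 0.0:
        raise ValueError("no samples fall under the filter at x")
    return num / den

if __name__ == "__main__":
    positions = [0.0, 0.1, 0.11, 0.12, 0.9]
    # A constant signal is reproduced exactly, however the samples are spaced.
    print(weighted_average(0.3, positions, [0.7] * 5))
    # A cluster of dark samples near x = 0.1 dominates the estimate at x = 0.5.
    print(weighted_average(0.5, positions, [0.0, 0.0, 0.0, 0.0, 1.0]))
```

The normalization guarantees that flat regions stay flat, but the second call shows how a dense cluster still outvotes an isolated sample.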

But as Mitchell points out [307], this filter does not handle extreme variations 
FIGURE 10.100 

The Mitchell-Netravali cubic filter (1/3,1/3). (a) Filter curve, (b) Fourier transform of (a), (c) 3D 
plot of flat-field response, (d) Contour plot of (c). 


in local sample density. Figure 10.101 (color plate) shows this filter applied to a 
straight edge between a black region and a white one; the filter reconstructs a bumpy 
transition rather than a smooth one. 

This behavior comes about because the goal of adaptive nonuniform sampling is 
to gather information, and not to present that information in a way that is appropriate 
for reconstruction. When an adaptive, nonuniform sampling technique is used 
to sample the edge of Figure 10.101(a), samples of the signal are drawn until we can 
deduce the relative areas of the two regions. This deduction is made on the basis 
of the sample values and their locations, particularly where they group together. A 
FIGURE 10.102 

Sample clumping at an edge. 


dense set of black samples is likely to represent a black region of the image, but it is 
no more black than another region containing only one sample. 

Figure 10.102 shows one possible result of adaptively sampling an edge between 
a white region (with value 1) and a black region (with value 0) in a rectangular cell 
of edge length 1. We suppose that the initial sampling pattern placed two samples in 
this cell, and one each fell on the black and white regions. Suppose that a refinement 
algorithm was then invoked, and samples were drawn densely near the edge, where 
about half landed on the black side and half on the white side. Together, the sample 
values and locations do a pretty good job of representing the signal in this cell. 

Suppose we now filter this cell for reconstruction, using a box filter. The white 
region occupies the upper-left triangle of a square about 1/3 on a side, so the white 
area is about 1/18 ≈ 0.056. But because our samples are almost evenly distributed 
in the two regions, we will get an average of about 0.5, which is an error of about an 
order of magnitude (quantized to eight bits, this is a value of about 127 rather 
than the more accurate 14). We would get qualitatively similar results using the 
reconstruction filters described above. 

To handle this problem, Mitchell has proposed a multistage filter [307]. The 
filter is actually several box filters applied successively. Suppose that we want to 
reconstruct for new sample geometry on a 2D square grid with intersample distance d. 

The multistage filter begins by scanning the grid, taking steps d/4 in both x and 
y. At each step, a box filter with total width d/4 is applied to the signal; all the 
FIGURE 10.104 

The multistage filter spectrum. Redrawn from Mitchell in Computer Graphics (Proc. Siggraph 
'87), p. 68. 


samples in this filter are averaged together and the result deposited back into that 
sample location. 

Next the image is scanned again, this time in increments of d/2 in both directions, 
and with a box filter with side lengths d/2 . Mitchell applies this filter twice. Finally, 
the grid is scanned with step d and a box filter d units on a side is used at each 
step. The filtered value is used as the final value for that sample, with the result of 
Figure 10.103 (color plate) for the step function. Each of these filters is normalized 
before application by dividing by the number of samples under it. 
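
The following Python sketch shows the idea on a single cell of edge length d = 1. It is a simplified, hierarchical variant and not Mitchell's exact procedure: each stage uses aligned, non-overlapping boxes, so the repeated d/2 pass is omitted (repeating it on aligned boxes would change nothing), and empty boxes are simply skipped. The sample layout (one base sample per d/4 cell plus 200 samples clumped along the edge) and all function names are assumptions made for the demonstration.

```python
from collections import defaultdict

def average_into_cells(samples, cell):
    """Stage 1: average all (x, y, value) samples falling in each square cell."""
    sums, counts = defaultdict(float), defaultdict(int)
    for x, y, v in samples:
        key = (int(x // cell), int(y // cell))
        sums[key] += v
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

def merge_cells(cells, factor=2):
    """Later stages: average each factor x factor block of non-empty cells."""
    sums, counts = defaultdict(float), defaultdict(int)
    for (ix, iy), v in cells.items():
        key = (ix // factor, iy // factor)
        sums[key] += v
        counts[key] += 1
    return {k: sums[k] / counts[k] for k in sums}

def multistage(samples, d=1.0):
    """Boxes of width d/4, then d/2, then d (simplified: one d/2 pass)."""
    quarter = average_into_cells(samples, d / 4.0)
    half = merge_cells(quarter)
    return merge_cells(half)

def box_average(samples):
    """Single box filter over the whole cell: a plain mean of the sample values."""
    return sum(v for _, _, v in samples) / len(samples)

def is_white(x, y):
    """White upper-left triangle with legs of length 1/3 (area 1/18)."""
    return 1.0 if y - x > 2.0 / 3.0 else 0.0

def make_samples(n_dense=200):
    """Hypothetical pattern: a sparse 4 x 4 base grid plus a clump along the edge."""
    pts = [(0.125 + 0.25 * i, 0.125 + 0.25 * j) for i in range(4) for j in range(4)]
    for k in range(n_dense):
        t = (k + 0.5) / n_dense / 3.0
        side = 0.0005 if k % 2 == 0 else -0.0005  # alternate white and black sides
        pts.append((t - side, t + 2.0 / 3.0 + side))
    return [(x, y, is_white(x, y)) for x, y in pts]

if __name__ == "__main__":
    samples = make_samples()
    print("true area  ", 1.0 / 18.0)
    print("box filter ", box_average(samples))
    print("multistage ", multistage(samples)[(0, 0)])
```

With this pattern the plain box average comes out near 0.5, while the staged average lands much closer to the true coverage of 1/18, because the clump along the edge can dominate only the few d/4 boxes it occupies.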

The multistage filter works well even when the sampling rate is very nonuniform 
under any of the filters. This is because locally dense clusters of samples do not 
get the opportunity to overwhelm sparse samples by virtue of sheer numbers. The 
box-filter passes first average together the samples in regions that are very small with 
respect to the final sampling density. Even if there is great variance of sampling rate 
in the first-pass cells, each contributes only 1/16 to the final sample values, so a few 
errors at this stage don’t influence the final value too much. The next stages have the 
effect of smoothing the signal at the next level of granularity, since there can still be 
some clumping in one corner of a cell and not much information in another. Finally, 
everything in a tile around the final sample is averaged. 

This intuitive explanation is supported by a theoretical argument based on multiple 
convolution by box filters. The multistage filter may be written as the convolution 
of four boxes, which is equivalent to a piecewise cubic filter. This filter is shown 
in Figure 10.104. The topic of multiple convolution with box functions has been 
studied extensively by Heckbert [204]. 









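Name: Pavicic
f(r): ... (B-spline)
Volume: 0.60
Width: 1

Name: Cook
f(r): $f(r) = \begin{cases} e^{-r^2} - e^{-w^2} & r < w \\ 0 & \text{otherwise} \end{cases}$
Volume: $\pi \left[ 1 - e^{-w^2} (1 + w^2) \right]$
Width: w

Name: Dippé
f(r): $f(r) = \frac{1}{w} \left[ \cos\!\left( \frac{2\pi r}{w} \right) + 1 \right]$
Volume: $\pi w$
Width: w

Name: Max
f(r): $f(r) = \begin{cases} 1 - \dfrac{r^2}{st} & 0 \le r \le s \\ \dfrac{(t - r)^2}{t(t - s)} & s \le r \le t \\ 0 & \text{otherwise} \end{cases}$
Volume: $2\pi \left( \dfrac{s^2}{2} - \dfrac{s^4}{4st} + \dfrac{t^4}{12\,t(t - s)} - \dfrac{s^2 t^2}{2\,t(t - s)} + \dfrac{2 s^3 t}{3\,t(t - s)} - \dfrac{s^4}{4\,t(t - s)} \right)$
Width: t

Name: Mitchell
f(r): $f(r) = \dfrac{1}{6} \begin{cases} (12 - 9B - 6C) r^3 + (-18 + 12B + 6C) r^2 + (6 - 2B) & r < 1 \\ (-B - 6C) r^3 + (6B + 30C) r^2 + (-12B - 48C) r + (8B + 24C) & 1 \le r < 2 \\ 0 & \text{otherwise} \end{cases}$
Volume: $2\pi \, \dfrac{9 + 5B - 4C}{60}$
Width: 2

Name: Multistage
Width: variable


FIGURE 10.105 

A summary of the reconstruction filters in this section. 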


Summary 

A summary of the above filters is given in Figure 10.105. It would be very useful to be 
able to state at this point what the “best” filter is for all uses, and then recommend 
that filter for rendering programs. Unfortunately, this is almost impossible. As 
we saw earlier, all filter designs impose trade-offs among several different criteria. 
which are often mutually antagonistic. My purpose here has been to present a small 
collection of generally useful filters, but no one is best for all uses. 

Having given that caveat, some guidance may be useful: I recommend the two 
filters at the bottom of Figure 10.105. The cubic filter of Mitchell and Netravali is 
probably the most extensively reported in graphics for image reconstruction, and it 
has good performance characteristics. The multistage algorithm handles variation in 
sampling density with a series of box filters; it’s therefore relatively easy to implement 
and fast to execute. 

Filters are complex and subtle, and a lot is known about both analog and digital 
filter design. Computer graphics has not used much of the classical digital filter 
repertoire, such as the filters discussed by Hamming [185]. This is probably because 
we are often interested in the reconstruction of nonuniform samples, which has 
not been a big issue in most of the image processing literature. The great body of 
work in audio filters is also not appropriate for graphics work, because, as Mitchell 
and Netravali pointed out [310], those 1D filters often produce sonically acceptable 
artifacts that are visually unacceptable. Once again the point is that whether or 
not a filter is “good” depends very strongly on the application. In this engineering 
discipline, nothing substitutes for experience, measurement, and careful observation. 

When selecting a reconstruction or low-pass filter, you need to balance implementation 
and running time against performance issues. We often use fast (and 
somewhat sloppy) filters when reconstructing the illumination signal at a point, and 
put more attention on the image signal. This approach, while attractive in terms 
of performance, can seriously compromise the numerical accuracy of the rendered 
image. When it’s important to have a correct simulation of a real scene, it’s not 
enough to simply gather illumination carefully. The image needs to be processed 
properly during the shading step, which often requires reconstruction, filtering, and 
resampling. The filters described in this chapter can be used for that task as well. 


10.11 Further Reading 

A survey of filters and filtering techniques for image processing may be found in 
Pratt [345]. Multidimensional filtering is discussed by Dudgeon and Mersereau 
[130]. Efficient filtering methods often require sophisticated data structures for 
quick access to the relevant samples; a thorough discussion of data structures may be 
found in the two-volume set by Samet [373,374]. A variety of work on nonuniform 
reconstruction is summarized in Marvasti’s book [283], though it is difficult to 
obtain. A thorough discussion of various extensions to the sampling theorem is 
presented by Jerri [231]. 

The aliasing problem was first addressed in computer graphics by Crow in 1977 
[108]. A good discussion of practical anti-aliasing methods as of 1981 is given 
in a later paper by Crow [109]. Stochastic sampling was introduced to computer 
graphics by Cook in a classic 1986 paper [101], but see the follow-up letters by 
Pavlidis [336] and Wold and Pepard [486]. A survey of digital signal-processing 
methods for computer graphics is presented by Wolberg in his book [485], which 
includes many filter discussions and some source code for the FFT and various 
interpolation methods. Heckbert [206] also discusses nonuniform reconstruction 
and filtering. 

Many methods have been developed to avoid noise that may have crept into a 
signal being sampled. If the noise itself is of interest, Shapiro and Silverman have 
presented a means to sample that noise without aliasing [394]. This may be useful 
in characterizing the amount of noise in a signal prior to a step where it is removed. 
Cheung and Marks [87] have shown that under specific conditions, some samples 
may be disposed of from a sample set, effectively lowering the sample rate without 
introducing aliasing. 

Although the reconstruction problem is often phrased (as here) as though getting 
final sample values into the frame were the ultimate goal, those sample values when 
finally displayed are essentially filtered by the display device. This issue has been 
addressed in the graphics community in a letter by Pavlidis [336] and a paper by 
Kajiya and Ullner [238]. 

Most of the papers referenced in this chapter contain a discussion of the various 
signal-processing issues they address. For a good overview of some practical issues 
in signal processing in a rendering system, see Shirley and Wang’s paper [401]. 

10.12 Exercises 

Exercise 10.1 

Give a geometric proof of the hexagonal jittering formula in Equation 10.9. 

Exercise 10.2 

To sample a square domain using triangular sampling, the domain must be placed 
inside a triangle. Find the largest square that can fit inside an equilateral triangle. 
Give a formula for the number of samples inside the square as a function of the 
uniform subdivision level g of the triangle. 

Exercise 10.3 

Implement the algorithm of Figure 10.22 to build a jittered sample pattern 
within a unit square. Compute the largest possible value of the radius r_p assuming 
hexagonal packing of N = 50 samples. 

(a) Using a radius r_p/2, run the algorithm five times with different starting seeds 
to generate N = 50 samples, and count how many samples are tested until 
fifty have been accepted. Plot a graph of number of samples accepted as a 
function of the number generated. 
FIGURE 10.106 

(a) Filling the plane by translating a single tile. (b) Filling the plane by flipping tiles across common 
borders. 


(b) Repeat (a) for a radius of 3r_p/4. 

(c) Repeat (a) for a radius of 7r_p/8. 

(d) Interpret your results for parts (a), (b), and (c). Is this a fair implementation 
of rejection sampling? Is this a good way to develop a dense pattern? 


Exercise 10.4 

Using one of your patterns from Exercise 10.3, create a sampling grid 10 x 10 squares 
on a side by translating your tile ninety-nine times, as in Figure 10.106(a). 

(a) Plot the log magnitude of the Fourier transform of your original tile. Interpret 
the distribution of energy represented by this transform. 

(b) Plot the log magnitude of the Fourier transform of the larger tiled domain. 
Interpret the distribution of energy represented by this transform. 

(c) Create a different 10 x 10 domain by reflecting the tile across each border, 
as in Figure 10.106. Plot the log magnitude of the Fourier transform of this 
domain, and interpret the distribution of energy represented by this transform. 

(d) Create a different 10 x 10 domain by choosing a random orientation of the 
tile in each cell, selecting from the eight possible transformations of a square. 
Plot the log magnitude of the Fourier transform of this domain, and interpret 
the distribution of energy represented by this transform. 

Exercise 10.5 

Discuss the advantages and disadvantages of controlling sampling on the image plane 
based purely on the relationships of the intensities of the samples. Can this ever lead 
to wasteful computation? Can it ever neglect computation that should occur? 

Exercise 10.6 

Create a nonuniform sampling pattern by jittering a 32 x 32 grid of points in a unit 
square, and use this pattern to sample a 20 x 20 black-and-white checkerboard in 
the square. Reconstruct the signal using the following filters, and then resample the 
result on a 32 x 32 grid. 

(a) Nearest neighbor. 

(b) Abutting box filters. 

(c) Gaussians with a height of 1 and value 0.001 at radius 1/64. 

(d) Gaussians with a height of 1 and value 0.5 at radius 1/64. 

Compare the quality of the images produced by the different filters. 
To the inspiration of Leonardo da Vinci 


Snow taken from the high peaks of mountains might be carried to hot places and let 
fall at festivals in open places at summer time. 


Leonardo da Vinci 





MATTER AND 


ENERGY 


I ask to have this much granted me—to assert that 
every ray passing through air of equal density 
throughout, travels in a straight line from its cause 
to the object or place it falls upon. 


Leonardo da Vinci 





INTRODUCTION TO UNIT III 


In this part of the book we turn our attention to the physical world around us. 
Each chapter focuses on one part of a discussion that will lead us to the climax of 
the unit: the radiance equation. 

We begin with a study of light in its many forms. This leads us to transport theory, 
which is a means for quantifying the distribution of energy in an environment; light 
energy will ultimately be our main concern. To describe this distribution of light, we 
use ideas drawn from the field of radiometry, which provides us terms and units for 
discussing how much light energy of a particular type is moving from one place to 
another in a scene. For image synthesis we are quite interested in the interaction of 
light and a material, which is any physical substance that interacts with light. After 
we discuss the foundations of material structure, we will examine shading, which 
is a class of high-level techniques for modeling the interaction of light and matter. 
Finally, we observe that the equation that links all of these concepts is an integral 
equation; finding a solution to this type of equation is the goal of image synthesis, 
so we discuss methods for solving integral equations in some detail. 

One way to look at a rendering algorithm is as a simulation of some model of 
the physics of light. We are completely free to choose how the physics will work: 
the everyday coarse physics of our universe is only one important example. But 
choosing the natural world as a driving problem has two benefits. The first is that 
we know what the world looks like, so we can use our own visual system and 
experience to debug our pictures. Computer graphics has benefited from a very high 
bandwidth channel for communicating the results of potentially millions of runs of 
our algorithms: a picture. If the picture doesn’t look right, we can form theories 
about our bugs just by looking. This isn’t a conclusive test that a program is correct, 
but it’s a great way of spotting many ways in which it may be incorrect. 

The second advantage of the natural world as our subject is that it is important 
in a practical sense, with applications from flight simulation to industrial design. 
Since the natural world comes from without, we have no more creative freedom 
when using an accurate simulation model than we would have over materials in the 
real world. This tends to prevent certain kinds of artistic exploration. Nevertheless, 
simulations of the real world are more easily debugged and have wider immediate 
application, so they have become the most popular driving problem for rendering. 

Good mathematical treatments of the physics of surfaces and the physics of light 
have been around for a while; computer graphics has recently started to move deeper 
into this literature. It seems that the best way to describe how energy bounces around 
is with an equation that describes the relationship of all the light in the scene at once. 
This is an integral equation. 

Just as a differential equation expresses a function in terms of its derivatives, an 
integral equation expresses a function in terms of its integrals. Given a scene, we 
can write down the equation that precisely describes the energy at every point in the 
scene. To find the light anywhere in the scene, we need only get the value of the light 
function at that point and in that direction. 

Unfortunately, solving for this “radiance equation” analytically seems hopeless. 
Just as we saw for sampling, when we seek general solutions for complicated functions 
in multiple dimensions, numerical techniques offer hope where analytic methods 
fail. 

Finding solutions to the radiance equation is what rendering is all about. 
Almost every rendering algorithm published to date can be thought of as a solution 
technique for that equation. The recent link between the theory of integral equations 
and the radiance equation has provided a solid basis of important results to 
guide our development of new algorithms. 

To see how the same function may admit different solutions, we can look at 
rendering from the point of view of a house painter. The customer has left the 
following painting instructions in an implicit form: “Place three coats of blue paint 
on all vertical walls.” The house painter has lots of ways to satisfy this requirement 
(that is, lots of solutions to the equation). For example, she can place blue dots at 
random all over the walls until statistically the average depth is three coats; she can 
paint all four walls once and then repeat that action two times; or she can paint 
each wall three times in a row before moving on. Each of these methods will give 
a slightly different result, and they offer different advantages; suppose the painter is 
concerned with the time it takes and the amount of exercise she gets while painting. 
Painting random dots keeps her moving but it’s slow; painting the same wall three 
times before moving on has little movement but it’s fastest; the other method is in 
between. When we look at solution methods to the radiance equation, we 
find we are interested in the running time, memory requirements, and generality of 
the solution, as well as artifacts or restrictions. 

This unit gives us the vocabulary for quantifying the distribution of light throughout 
an environment as it interacts with matter, and the mathematical tools for finding 
a description of this distribution that may be used for creating images. 



It probably doesn't matter if, while trying to be 
modest and eager watchers of life's many 
spectacles, we sometimes look clumsy or get 
dirty or ask stupid questions or reveal our 
ignorance or say the wrong thing or light up 
with wonder like the children we all are. 

Diane Ackerman 

("A Natural History of the Senses," 1990) 
11.1 Introduction 

In image synthesis we are interested in the simulation of light. Light has a famous 
dual nature: in some situations it seems to behave as though it is a stream of particles, 
and in other situations as though it is a wave. This dual nature leads to phenomena 
that we see every day, from shiny pieces of metal to the colors in a bird’s feathers. 

We will briefly discuss both interpretations. In this book we will generally ignore 
the wave aspect of light, but it is important to justify that decision on physical 
grounds, and understand what we are giving up by doing so. 


11.2 The Double-Slit Experiment 

We consider the wave nature of light first. We begin by noting that when waves 
of any sort pass through a small hole, or around an object with a sharp edge, they 
always tend to spread out. This physical phenomenon is called diffraction. In the 
1600s Huygens suggested that when a wave encounters an opaque barrier containing 
a very small hole, on the other side of the barrier the hole looks like a point source 

FIGURE 11.1 

A small hole acts like a spherical point source of waves. 


of spherically symmetric waves, as in Figure 11.1. A single slit in a barrier acts as 
a source of cylindrical waves, as shown in Figure 11.2. This means that a sharp 
shadow turns into a fuzzy one some distance from the edge. 

In 1801 Thomas Young performed an elegant and influential experiment called 
the double-slit experiment, which argued strongly for the wave interpretation of the 
nature of light. 

Young’s experiment is illustrated schematically in Figure 11.3(a). Starting at the 
left of the experimental setup, sunlight strikes an opaque barrier with a single small 
vertical slit. At some distance beyond the first barrier is a second, opaque barrier, 
this time with two parallel slits. Beyond this second barrier sits a sheet of blank 
paper. When the slits are close together, the pattern of light striking the paper has 
the form shown in Figure 11.3(b). The remarkable thing about this pattern is that it 
consists of alternating bright and dark bands. The problem that confronted Young 
was how to explain this pattern. 

This can be done most easily by positing that the light exiting the two slits has a 
wavelike nature. That is, as light radiates from each slit, its energy at any point may 






FIGURE 11.2 

A small slit acts like a cylindrical source of waves. 



FIGURE 11.3 

The double-slit experiment. (a) Experimental setup, (b) Pattern on the recording screen. 

FIGURE 11.4 

A slice of the double-slit experiment. 


be described by a function that is periodic with time. Let us assume that this is the 
case. 

This assumption suggests there is a periodic function A(x,t) that gives us the 
energy of a beam of light at any point x and time t. If we fix x, then the function 
depends only on the time t. Suppose that this function is given by A(t) = sin(t). 

Returning now to the double-slit experiment, because the waves are cylindrically 
symmetric we can take an arbitrary cross section parallel to the axis of the waves to 
represent the general case; we do this in Figure 11.4. We set up a coordinate system 
with the origin between the slits, oriented as shown. The two slits are at positions 
(0, S_1) and (0, S_2), and we are interested in the combined energy of the waves falling 
on some point P = (l, p). 

The two waves travel distances ||P - S_1|| and ||P - S_2||. Now we can see the 
reason for the single-slit barrier: this sets up the initial wave so that it strikes both 
slits at the same distance from the first slit, so the waves leaving the double slits 
have the same phase. In other words, at any time t, the same wave is generated in 
synchrony at both slits (which act as sources of cylindrical waves). 

Since the waves are periodic over distance, then at a given moment if we move 
along the receiving screen, we will sweep through the wave function, since the 
distance to the slit will be either smoothly increasing or decreasing depending on 
where we are and in which direction we’re moving. The distances to the two slits are 
different for every point except the one exactly between them, so we would expect 
in general that the amplitude of the received waves will vary with position. In fact, 
at some places the waves both arrive at their maximum amplitude. 

Suppose at some point P the distance d_1 to slit 1 is given by d_1 = 2πk_1 + π/2 and 
the distance to slit 2 is given by d_2 = 2πk_2 + π/2 for two integers k_1 and k_2. Then 
at this point the waves constructively interfere, since sin(d_1) + sin(d_2) = 1 + 1 = 2, 
and we get a bright spot. But suppose at some other point Q one wave arrives at its 
maximum and the other at its minimum, so d_1 = 2πk_1 + π/2 and d_2 = 2πk_2 + 3π/2. 
Then sin(d_1) + sin(d_2) = 1 - 1 = 0. In this case the waves destructively interfere 
and we get a dark spot. Between these extremes we get different intensities due to 
different amounts of interference between the two waves. 
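
The two extreme cases can be reproduced directly. This sketch follows the simplified model in the text, where each wave contributes sin(d) for a path of length d; the function names and the particular slit and screen coordinates are hypothetical choices for illustration.

```python
import math

def combined_wave(d1, d2):
    """Sum of two unit waves that have traveled distances d1 and d2."""
    return math.sin(d1) + math.sin(d2)

def energy_on_screen(l, p, s1, s2):
    """Combined value at screen point P = (l, p) for slits at (0, s1) and (0, s2)."""
    d1 = math.hypot(l, p - s1)
    d2 = math.hypot(l, p - s2)
    return combined_wave(d1, d2)

if __name__ == "__main__":
    # Constructive: both distances are of the form 2*pi*k + pi/2.
    print(combined_wave(2 * math.pi * 3 + math.pi / 2, 2 * math.pi * 5 + math.pi / 2))
    # Destructive: one distance is 2*pi*k + pi/2, the other 2*pi*k + 3*pi/2.
    print(combined_wave(2 * math.pi * 3 + math.pi / 2, 2 * math.pi * 5 + 3 * math.pi / 2))
    # Sweeping along the screen passes through intermediate values.
    for p in (0.0, 1.0, 2.0, 3.0):
        print(p, energy_on_screen(20.0, p, 1.0, -1.0))
```

Sweeping p more finely traces out the alternating bright and dark bands described above.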

This light-and-dark pattern of interference fringes argued strongly that we should 
interpret light as a wave phenomenon, since the wave theory explains the physical 
phenomena accurately and elegantly. The study of the wave nature of light is called 
physical optics. 

Throughout this book, we will use the symbol ν to refer to the frequency of a 
beam of light (the symbol f is also common for this term, but we reserve that to 
stand for functions). The distance traveled by a beam during the time it takes to 
oscillate through one period is the wavelength λ. If light has a speed of propagation 
c in a particular medium, then c = λν. 
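
As a worked example of this relation, take green light with a wavelength of about 500 nm in vacuum, where c is approximately 3.0 × 10^8 m/s (both values are rounded and chosen only for illustration):

```latex
\[
\nu = \frac{c}{\lambda}
\approx \frac{3.0 \times 10^{8}\ \mathrm{m/s}}{5.0 \times 10^{-7}\ \mathrm{m}}
= 6.0 \times 10^{14}\ \mathrm{Hz}
\]
```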


11.3 The Wave Nature of Light 

We will find it useful to develop a basic understanding of the wave nature of light. 
This will allow us to understand the phenomenon of polarization, and its role in 
various shading models discussed in Chapter 15. 

Following the classical approach of Bohren and Huffman [53], we describe light 
as energy carried by a pair of coupled fields: the electric field E and the magnetic field 
H. In general, both E and H are complex-valued functions of space and time. They 
are described by a set of four famous equations known as Maxwell's equations, which 
lay down the principles for the behavior of electricity and magnetism. Because the 
two fields are intimately coupled (one never appears without the other), the single 
term electromagnetic is often used to describe this energy. Both the electric and 
magnetic fields may be modeled as time-harmonic fields, that is, periodic functions 
of time and space. The Poynting vector S = E × H indicates the magnitude and 
direction of propagation of the transfer of electromagnetic energy. 

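The simplest time-harmonic fields are the plane waves, which are nothing but the 
complex sinusoids from Unit II in a slightly different form: 

\[
\begin{aligned}
\mathbf{E} &= \mathbf{E}_0 \, e^{-j(\mathbf{k} \cdot \mathbf{x} + \omega t)} \\
\mathbf{H} &= \mathbf{H}_0 \, e^{-j(\mathbf{k} \cdot \mathbf{x} + \omega t)}
\end{aligned}
\tag{11.1}
\]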

Here we have made the exponential depend on a wave vector k that describes the 
direction of propagation of the wave. 

This form of wave has an immediate physical interpretation. The complex ex¬ 
ponential sweeps out a coupled set of sine and cosine curves as a function of its 
argument. Consider for a moment just the first term in the argument -j(k * x 4- ut). 
If this term were only the spatial position x, then the wave would be spherical: at all 
points the same distance from the origin, the exponent would have the same value. 
Instead we are using k * x, which means that the value of the function at each spatial 
point x is found by projecting that point perpendicularly onto the direction vector 
k. In other words, k is the normal to a plane, and all points on that plane have 
the same value. The scalar offset for the plane is controlled by the second term, ut, 
which simply says that as the time t increases, each plane moves away at a speed u. 
Thus this complex exponential creates an endless series of moving planes of constant 
(complex) value. This is diagrammed in Figure 11.5. 

In general, k may be complex (k = k r 4 jk, ) for two real vectors k r and k f , so 
we can expand Equation 11.1 as 

E = E 0 exp[(ki • x) 4 j (k r • x - ut)] 

= E 0 exp[k T * x] exp[jk r * x - jut] 

H = H 0 exp[(k; • x) 4 j (k r • x - ut)] 

~ H 0 exp[k, * x] exp[jk r • x - jut] (11.2) 

where E 0 exp[k* * x] is the amplitude of the electric field, and <t> = k r * x - ut is the 
phase ; the same labels apply to the components of the magnetic field. 

Note that k t • x defines a plane with surface normal k*. Therefore k, is per¬ 
pendicular to surfaces of constant phase ; that is, all points x on that plane have 
the same phase (j). Similarly, k r is perpendicular to surfaces of constant amplitude. 
When k* and k r are parallel, we say the waves are homogeneous ; otherwise they are 
inhomogeneous. 
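The split of a plane wave into amplitude and phase can be checked numerically. The following Python sketch is only an illustration, and it assumes the convention $E = E_0 \exp[-\mathbf{k}_i \cdot \mathbf{x}] \exp[j(\mathbf{k}_r \cdot \mathbf{x} - \omega t)]$ with $\mathbf{k} = \mathbf{k}_r + j\mathbf{k}_i$; the function names `dot` and `plane_wave` and all the sample values are hypothetical. It evaluates one scalar field component at two points on the same plane perpendicular to $\mathbf{k}_r$ and compares their amplitude and phase.

```python
import cmath
import math

def dot(a, b):
    """Dot product of two real 3-vectors given as tuples."""
    return sum(ai * bi for ai, bi in zip(a, b))

def plane_wave(E0, k_r, k_i, x, omega, t):
    """Return (value, amplitude, phase) of one scalar field component.

    Assumed convention: E = E0 * exp[-(k_i . x)] * exp[j (k_r . x - omega t)].
    """
    amplitude = E0 * math.exp(-dot(k_i, x))
    phase = dot(k_r, x) - omega * t
    return amplitude * cmath.exp(1j * phase), amplitude, phase

# A homogeneous wave: k_r and k_i both point along +Z (illustrative values).
k_r = (0.0, 0.0, 2.0)
k_i = (0.0, 0.0, 0.5)
omega, t = 3.0, 0.25

# Two different points on the same plane z = 1, which is perpendicular to k_r.
_, amp_a, phase_a = plane_wave(1.0, k_r, k_i, (0.0, 0.0, 1.0), omega, t)
_, amp_b, phase_b = plane_wave(1.0, k_r, k_i, (5.0, -2.0, 1.0), omega, t)
print("same amplitude:", amp_a == amp_b, " same phase:", phase_a == phase_b)
```

Because the wave in this example is homogeneous, the planes of constant phase and the planes of constant amplitude coincide; choosing a `k_i` that is not parallel to `k_r` would separate them.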

FIGURE 11.5

The geometry of plane waves.

To see how the electromagnetic field moves through space, it is helpful to track
a surface of constant phase. This is called a wavefront. This is just like watching
the high points of the ripples created by a stone thrown into a pond; the top of a
ripple forms a circle of constant phase, and following that point tells us something
about how fast the wave is traveling. Consider a plane wave moving in a direction
parallel to the Z axis (that is, $\mathbf{k}_r = k_r(0,0,1)$). At some time $t_0$, it will have phase
$\phi = k_r z - \omega t_0$. At some time $t_0 + \Delta t$, it will have moved a distance $\Delta z$, but the phase
is the same by definition: $\phi = k_r(z + \Delta z) - \omega(t_0 + \Delta t)$. The phase velocity $v$ is the
speed of this surface. Equating the two expressions for $\phi$ gives $k_r \Delta z = \omega \Delta t$, so

$$v = \frac{\Delta z}{\Delta t} = \frac{\omega}{k_r} \tag{11.3}$$

which defines the velocity (in direction $\mathbf{k}_r$) of the wavefront.

To proceed, we turn to Maxwell’s equations [53] for electromagnetic energy.
These equations are one of the crown jewels of physics, and represent in a compact
and elegant manner important truths about our physical universe. Many different
derivations of these equations are available in books on physics, as well as optics,
communications, and electronics. Because they are fundamentally based on the wave
nature of light, and our concern in this book is almost exclusively with the particle
nature of light, we will not see Maxwell’s equations again explicitly in this book.

Because this is their only appearance, we will be content to simply state Maxwell’s
equations here, in a form specialized for plane waves, since that’s all we care about at
the moment. Maxwell’s equations for plane waves are four short equalities linking
the electric field $\mathbf{E}$, the magnetic field $\mathbf{H}$, the wave vector $\mathbf{k}$, the frequency of radiation
$\omega$, and a few physical constants that describe the material (or medium) through which
the wave is propagating. These equations are

$$\mathbf{k} \cdot \mathbf{E}_0 = 0 \tag{11.4}$$

$$\mathbf{k} \cdot \mathbf{H}_0 = 0 \tag{11.5}$$

$$\mathbf{k} \times \mathbf{E}_0 = \omega\mu\mathbf{H}_0 \tag{11.6}$$

$$\mathbf{k} \times \mathbf{H}_0 = -\omega\epsilon\mathbf{E}_0 \tag{11.7}$$

where the phenomenological material parameters are as follows:

$\mu$ is the permeability
$\sigma$ is the conductivity
$\chi$ is the electric susceptibility
$\epsilon = \epsilon_0(1 + \chi) + j\sigma/\omega$ is the (complex) permittivity (11.8)

The three basic parameters $\mu$, $\sigma$, and $\chi$ specify the properties of a medium (or
material) and characterize how it responds to electromagnetic energy of frequency $\omega$.

Equations 11.4 and 11.5 say that the wave vector $\mathbf{k}$ is perpendicular to both the
electric and magnetic fields; such a wave is called transverse. Equations 11.6 and 11.7
also imply that the electric and magnetic fields are perpendicular to each other
(though if they are complex-valued, the interpretation of perpendicular doesn’t admit
a simple physical picture). When the wave is homogeneous, then the two fields and
the wave vector form a set of mutually perpendicular axes in 3D, as shown in Figure 11.6.

We can boil down the material constants into a form that will prove more useful
to us in graphics. First cross both sides of Equation 11.6 with the wave vector $\mathbf{k}$:

$$\mathbf{k} \times (\mathbf{k} \times \mathbf{E}_0) = \omega\mu(\mathbf{k} \times \mathbf{H}_0) = -\omega^2\epsilon\mu\mathbf{E}_0 \tag{11.9}$$

where we have applied Equation 11.7. Recalling the vector identity

$$\mathbf{A} \times (\mathbf{B} \times \mathbf{C}) = \mathbf{B}(\mathbf{A} \cdot \mathbf{C}) - \mathbf{C}(\mathbf{A} \cdot \mathbf{B}) \tag{11.10}$$

and applying it to the above,

$$\mathbf{k} \times (\mathbf{k} \times \mathbf{E}_0) = \mathbf{k}(\mathbf{k} \cdot \mathbf{E}_0) - \mathbf{E}_0(\mathbf{k} \cdot \mathbf{k}) \tag{11.11}$$

FIGURE 11.6

The wave vector, electric field, and magnetic field are perpendicular.

The first term on the right is 0 from Equation 11.4. Combining Equations 11.11 and
11.9, we find

$$\mathbf{k} \cdot \mathbf{k} = k_r^2 - k_i^2 + 2j\,\mathbf{k}_r \cdot \mathbf{k}_i = \omega^2\epsilon\mu \tag{11.12}$$

This equation tells us that the material properties $\epsilon$ and $\mu$ will admit a plane wave
with vectors $\mathbf{k}_r$ and $\mathbf{k}_i$, providing they meet a specific condition. The medium does
not uniquely specify a particular wave, but it does require that a wave meet this
condition.

It is common to rewrite Equation 11.12 as

$$|\mathbf{k}| = |\mathbf{k}_r + j\mathbf{k}_i| = \frac{\omega N}{c} \tag{11.13}$$

where $c$ is the speed of light in a vacuum, and the complex refractive index $N$ is
given by

$$N = c\sqrt{\epsilon\mu} = \sqrt{\frac{\epsilon\mu}{\epsilon_0\mu_0}} \tag{11.14}$$

where $\epsilon_0$ and $\mu_0$ are the permittivity and permeability of a vacuum; their values are
given in Table E.3. The complex number $N$ is often written

$$N = \eta + j\kappa \tag{11.15}$$

for $\eta, \kappa \ge 0$. The real part of $N$, often represented by $\eta$, is frequently called the
real index of refraction of the medium. The imaginary part $\kappa$ is called the extinction
coefficient and represents how easily a wave can penetrate into the medium. Both of
these coefficients are actually functions of wavelength and we will sometimes write
them as $\eta(\lambda)$ and $\kappa(\lambda)$ to keep this in mind.
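As a rough numerical illustration of Equations 11.8, 11.13, and 11.14, the following Python sketch computes the complex permittivity, the complex refractive index $N = \eta + j\kappa$, and the corresponding complex wave number for a medium. The function names and the sample material values are hypothetical, chosen only for illustration, and the vacuum constants are standard values typed in directly rather than copied from Table E.3.

```python
import cmath
import math

EPS0 = 8.8541878128e-12          # permittivity of a vacuum, F/m
MU0 = 4.0e-7 * math.pi           # permeability of a vacuum, H/m
C = 1.0 / math.sqrt(EPS0 * MU0)  # speed of light in a vacuum, m/s

def complex_permittivity(chi, sigma, omega):
    """Equation 11.8: eps = eps0 (1 + chi) + j sigma / omega."""
    return EPS0 * (1.0 + chi) + 1j * sigma / omega

def refractive_index(eps, mu):
    """Equation 11.14: N = c sqrt(eps mu) = eta + j kappa."""
    return C * cmath.sqrt(eps * mu)

def wave_number(N, omega):
    """Equation 11.13: k = omega N / c."""
    return omega * N / C

omega = 2.0 * math.pi * 5.0e14   # an illustrative optical angular frequency, rad/s
eps = complex_permittivity(chi=1.25, sigma=1.0e3, omega=omega)  # made-up material
N = refractive_index(eps, MU0)
k = wave_number(N, omega)
print("eta =", N.real, " kappa =", N.imag)
print("k*k and omega^2 eps mu:", k * k, omega ** 2 * eps * MU0)
```

The last line prints both sides of Equation 11.12 so they can be compared; a nonzero conductivity is what gives $N$ its imaginary part in this sketch.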


11.4 Polarization 

Time-varying electric and magnetic fields need not be radially symmetric around the 
direction of propagation. For example, Figure 11.7 shows the X and Y components 
of an electric field propagating in the Z direction. In Figure 11.7(a) the two fields are
in phase; the peaks and zero-crossings occur at the same location along the X axis.
By contrast, in Figure 11.7(b) the two fields are out of phase, so that their peaks and 
zero-crossings are at different places. 

Consider just the electric field part of an electromagnetic wave traveling in the Z
direction. Expanding out the complex exponential,

$$\mathbf{E} = \mathbf{A}\cos(kz - \omega t) - \mathbf{B}\sin(kz - \omega t) \tag{11.16}$$

We have seen above that a wavefront of constant phase moves through space; this
suggests that if we examine one location in space and measure the field strength over
time as it passes by us, it will move through the full cycle. For simplicity, we will
look at the field at $z = 0$ [53]:

$$\mathbf{E} = \mathbf{A}\cos(\omega t) + \mathbf{B}\sin(\omega t) \tag{11.17}$$

Equation 11.17 can be thought of as the curve swept out by the tip of the electric
field in the plane $z = 0$ as it moves through space. This curve has the form of an
ellipse.

Figure 11.8 shows the variety of curves that can be generated by Equation 11.17.
In general, if $A \ne 0$, $B \ne 0$, $A \ne B$, then we get an ellipse. An ellipse may be
described by three numbers: two axis lengths (a semi-major axis length labeled $a$,
and a semi-minor axis length labeled $b$), and an azimuth angle labeled $\psi$ (measured
with respect to an arbitrary reference axis). If the two axes have equal nonzero
lengths, $A = B \ne 0$, then we get a circle. Finally, if $A = 0$ or $B = 0$, the curve
degenerates into a line.
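The curve of Equation 11.17 can be classified numerically. The sketch below is one possible approach rather than a method from the text: it assumes $\mathbf{A}$ and $\mathbf{B}$ are real 2D vectors in the plane $z = 0$ and recovers the semi-axis lengths from the identities $a^2 + b^2 = |\mathbf{A}|^2 + |\mathbf{B}|^2$ and $ab = |A_x B_y - A_y B_x|$. The function name `vibration_ellipse` and the tolerance are hypothetical.

```python
import math

def vibration_ellipse(A, B, tol=1e-9):
    """Classify E(t) = A cos(wt) + B sin(wt) for real 2D vectors A and B.

    Returns (kind, a, b), where a >= b >= 0 are the semi-axis lengths.
    """
    s = A[0] ** 2 + A[1] ** 2 + B[0] ** 2 + B[1] ** 2   # equals a^2 + b^2
    p = abs(A[0] * B[1] - A[1] * B[0])                  # equals a * b
    root = math.sqrt(max(s * s - 4.0 * p * p, 0.0))
    a = math.sqrt((s + root) / 2.0)
    b = math.sqrt(max((s - root) / 2.0, 0.0))
    if b <= tol * max(a, 1.0):
        kind = "linear"        # the ellipse has collapsed to a line
    elif a - b <= tol * a:
        kind = "circular"      # equal semi-axes
    else:
        kind = "elliptical"
    return kind, a, b

print(vibration_ellipse((2.0, 0.0), (0.0, 1.0)))
print(vibration_ellipse((1.0, 0.0), (0.0, 1.0)))
print(vibration_ellipse((1.0, 1.0), (0.0, 0.0)))
```

The three calls correspond to an ellipse, a circle, and a tilted line, in the spirit of the cases shown in Figure 11.8.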

Each of these different curves may be swept out by the electric field as it passes
through the plane. We use the shape to characterize the light, calling it elliptically
polarized, circularly polarized (including left- and right-handed circular polarization),
or linearly polarized as appropriate (the term plane polarized is a deprecated
synonym for linearly polarized [53]). The curve swept out on a stationary plane,
regardless of its shape, is called the vibration ellipse, and the values $(a, b, \psi)$ are
called the ellipsometric parameters. The shape and structure of the vibration ellipse
reveals the relationship between the phases of different components of the electric
field; this relationship is called the polarization of the field.

FIGURE 11.7

(a) The two fields are in phase. (b) The two fields are out of phase.

FIGURE 11.8

(a) An ellipse. (b) A circle. (c) A tilted line. (d) An axis-aligned line.

FIGURE 11.9

The two axes for measuring polarization.


Polarization is important to image synthesis because some materials respond
differently to light of different polarizations. If the real part $\eta$ of the complex index
of refraction $N$ varies with different forms of linearly polarized light, the material
is said to be linearly birefringent. If the imaginary part varies, the material is
linearly dichroic. Similarly, circularly birefringent and circularly dichroic materials
are sensitive to the degree of circular polarization in the incident light.

We typically think of polarization as the projection of the electric field onto two
orthogonal vectors lying in the plane perpendicular to the direction of propagation,
as in Figure 11.9. Arbitrarily, one of these is called the parallel axis and the other
the perpendicular axis (these axes are also sometimes called horizontal and vertical).
These labels on the axes are relative terms that don’t imply an absolute position, just
that the axes are mutually perpendicular. Usually when a surface is involved the axes
are oriented so that they are parallel and perpendicular to the local tangent plane of
the surface at a particular point.

The projection of the field $\mathbf{E}$ onto these axes is written $E_\parallel$ and $E_\perp$ for the parallel
and perpendicular axes, respectively. In general, these will be functions of time. If
$E_\parallel(t)$ and $E_\perp(t)$ are completely correlated over time, the light is said to be polarized.
If they are completely uncorrelated, the light is unpolarized. Light can be partially
polarized, indicating any amount of coupling between the components (for example,
the tip of the field may sweep out an ellipse that rotates slowly over time).

The ellipsometric parameters $(a, b, \psi)$ completely describe the state of polarization
of the light. An alternative that is often seen in the literature is a set of four
parameters called Stokes parameters, which are related to the components of the
field as follows [53]:

$$I = E_\parallel \bar{E}_\parallel + E_\perp \bar{E}_\perp$$

$$Q = E_\parallel \bar{E}_\parallel - E_\perp \bar{E}_\perp$$

$$U = E_\parallel \bar{E}_\perp + E_\perp \bar{E}_\parallel$$

$$V = j(E_\parallel \bar{E}_\perp - E_\perp \bar{E}_\parallel) \tag{11.18}$$

(Recall that $\bar{z}$ indicates the complex conjugate of a complex number $z$.) These are
related to the ellipsometric parameters as

$$I = d^2$$

$$Q = d^2 \cos 2\alpha \cos 2\psi$$

$$U = d^2 \cos 2\alpha \sin 2\psi$$

$$V = d^2 \sin 2\alpha \tag{11.19}$$

where

$$d^2 = a^2 + b^2$$

$$\psi = \text{azimuth angle}, \quad 0 \le \psi < \pi$$

$$|\tan\alpha| = \text{ellipticity} = b/a, \quad -\pi/4 \le \alpha \le \pi/4 \tag{11.20}$$

One of the advantages of this form of representation is that we can compute with
these parameters more conveniently than with the ellipsometric parameters. The
four-vector $(I, Q, U, V)$ can be treated as a column vector that specifies the polarization
state of a light beam, which is modified by $4 \times 4$ Mueller matrices that describe
the effect an optical component has on the polarization of light passing through it.
To compute the final polarization after a beam has been reflected or transmitted
several times, we need only apply the same sequence of matrix transformations to
the Stokes parameters.
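A short Python sketch may make the bookkeeping concrete. It computes the Stokes parameters of Equation 11.18 from two complex field components and applies one Mueller matrix to the result. The function names are hypothetical, and the polarizer matrix is the standard textbook Mueller matrix for an ideal linear polarizer aligned with the parallel axis, which is assumed here and is not taken from this chapter.

```python
import math

def stokes(e_par, e_perp):
    """Equation 11.18 for complex field components E_par and E_perp."""
    e_par, e_perp = complex(e_par), complex(e_perp)
    I = (e_par * e_par.conjugate() + e_perp * e_perp.conjugate()).real
    Q = (e_par * e_par.conjugate() - e_perp * e_perp.conjugate()).real
    U = (e_par * e_perp.conjugate() + e_perp * e_par.conjugate()).real
    V = (1j * (e_par * e_perp.conjugate() - e_perp * e_par.conjugate())).real
    return (I, Q, U, V)

def apply_mueller(M, s):
    """Multiply a 4x4 Mueller matrix by a Stokes column vector."""
    return tuple(sum(M[r][c] * s[c] for c in range(4)) for r in range(4))

# Ideal linear polarizer along the parallel axis (assumed standard matrix).
POLARIZER = (
    (0.5, 0.5, 0.0, 0.0),
    (0.5, 0.5, 0.0, 0.0),
    (0.0, 0.0, 0.0, 0.0),
    (0.0, 0.0, 0.0, 0.0),
)

s45 = stokes(1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0))  # linear, 45 degrees
out = apply_mueller(POLARIZER, s45)
print("input:", s45)
print("after polarizer:", out)
```

For a fully polarized beam the parameters satisfy $I^2 = Q^2 + U^2 + V^2$, which gives a quick sanity check on any implementation of Equation 11.18.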

A simpler computational structure with similar purposes is given by Jones vectors
and Jones matrices [148,311]. Consider again our wave from above, traveling along
the Z axis. We will assume the Y axis is the “perpendicular” direction, so X is the
“parallel” direction. We can write the two components of this wave at position $z$
directly:

$$E_x = A_x e^{j(kz - \omega t)} \qquad E_y = A_y e^{j(kz - \omega t + \phi)} \tag{11.21}$$

where we have introduced a phase-retarding factor of $\phi$ into the Y component. In
vector form, we can write these as

$$\mathbf{E}_x = \begin{pmatrix} A_x \\ 0 \end{pmatrix} e^{j(kz - \omega t)} \qquad \mathbf{E}_y = \begin{pmatrix} 0 \\ A_y \end{pmatrix} e^{j(kz - \omega t + \phi)} = \begin{pmatrix} 0 \\ A_y e^{j\phi} \end{pmatrix} e^{j(kz - \omega t)} \tag{11.22}$$

For polarization, we only care about the phase difference $\phi$, so we can ignore the
$\exp[j(kz - \omega t)]$ factor common to both components. Since we’re only concerned with
the phase difference and not the amplitudes, for convenience we will set $A_x = A_y = 1$.
The polarization of our light is then described by

$$\mathbf{E}_x = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad \mathbf{E}_y = \begin{pmatrix} 0 \\ e^{j\phi} \end{pmatrix} \tag{11.23}$$

These are called Jones vectors.

These two vectors indicate different polarizations. If a wave is completely characterized
by a single Jones vector $(0,1)^t$, then it is linearly polarized. If we take a second beam
of light linearly polarized perpendicular to the first (with vector $(1,0)^t$), then
adding the two beams corresponds to adding the two vectors, producing linearly
polarized light at a 45-degree angle, $(1,1)^t$. Often Jones vectors are written so that
they have unit length, so this vector would be written $(1/\sqrt{2})(1,1)^t$. Left-circularly
polarized light with a phase angle of $\pi/2$ would be represented by a Jones vector of
$(1/\sqrt{2})(1, e^{j(\pi/2)})^t = (1/\sqrt{2})(1, j)^t$. Similarly, right-circularly polarized light would
be $(1/\sqrt{2})(1, e^{j(-\pi/2)})^t = (1/\sqrt{2})(1, -j)^t$. Their sum is $(1/\sqrt{2})(2,0)^t$. There is no
Jones vector representation for unpolarized light.

The action of an optical element on a beam of light with a given polarization may
be represented by a Jones matrix, which is $2 \times 2$, and by convention premultiplies the
Jones vector (that is, a vector $\mathbf{V}$ is transformed by a matrix $\mathbf{M}$ as $\mathbf{V}' = \mathbf{M}\mathbf{V}$). Some
examples of Jones matrices are given in Table 11.1. When many optical elements are
combined in a series, their combined effect on the polarization of the incident light
may be found from the matrix product of the elements, applied to the initial vector
in the same order.

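For example, suppose we take a beam of light with initial Jones vector $\mathbf{J} = (u, v)^t$
and pass it through a horizontal polarizer $\mathbf{H}$, which removes all but the horizontal
component of the energy:

$$\mathbf{J}' = \mathbf{H}\mathbf{J} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u \\ 0 \end{bmatrix} \tag{11.24}$$

The result is a horizontally polarized beam, as we would expect.

TABLE 11.1

Examples of Jones matrices.

Horizontal linear polarizer    $\begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$
Vertical linear polarizer      $\begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$
Right-circular polarizer       $\begin{bmatrix} 1 & j \\ -j & 1 \end{bmatrix}$
Left-circular polarizer        $\begin{bmatrix} 1 & -j \\ j & 1 \end{bmatrix}$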

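If we now add a right-circular polarizer $\mathbf{R}$ after the horizontal polarizer, we find

$$\mathbf{J}' = \mathbf{R}\mathbf{H}\mathbf{J} = \begin{bmatrix} 1 & j \\ -j & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1 & 0 \\ -j & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u \\ -ju \end{bmatrix} = u \begin{bmatrix} 1 \\ -j \end{bmatrix} \tag{11.25}$$

which is a circularly polarized beam. If we reverse the order of the optical apparatus,
we get

$$\mathbf{J}' = \mathbf{H}\mathbf{R}\mathbf{J} = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & j \\ -j & 1 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} 1 & j \\ 0 & 0 \end{bmatrix} \begin{bmatrix} u \\ v \end{bmatrix} = \begin{bmatrix} u + jv \\ 0 \end{bmatrix} \tag{11.26}$$

which is horizontally polarized, albeit with complex amplitude. The fact that optical
elements are not commutative is captured by the mathematics, since this property is
shared by matrix multiplication.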
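The two orderings are easy to reproduce in a few lines of Python. This is a minimal sketch with hypothetical helper names (`mat_mul`, `mat_vec`), using the unnormalized matrices that appear in Equations 11.24 through 11.26 and an arbitrary sample input vector.

```python
def mat_mul(P, Q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(P[r][k] * Q[k][c] for k in range(2)) for c in range(2))
        for r in range(2)
    )

def mat_vec(M, v):
    """Apply a 2x2 Jones matrix to a Jones vector (u, v)."""
    return (M[0][0] * v[0] + M[0][1] * v[1],
            M[1][0] * v[0] + M[1][1] * v[1])

H = ((1, 0), (0, 0))       # horizontal linear polarizer
R = ((1, 1j), (-1j, 1))    # right-circular polarizer, as used in Equation 11.25

J = (0.6, 0.8)             # arbitrary sample input (u, v)
rh = mat_vec(mat_mul(R, H), J)   # H first, then R: expect (u, -j u)
hr = mat_vec(mat_mul(H, R), J)   # R first, then H: expect (u + j v, 0)
print("RHJ =", rh)
print("HRJ =", hr)
```

Because the matrices premultiply the vector, the element the light meets first is the rightmost factor in the product.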

11.5 The Photoelectric Effect

Another simple experiment can be performed that seems to defy explanation if we 
think of light as a wave. 

Consider the experimental apparatus in Figure 11.10. A beam of light is directed
onto a piece of metal (called the cathode); located off to one side we have a detection
device capable of measuring the energy of any electrons that strike it.

FIGURE 11.10

Apparatus for observing the photoelectric effect. The cathode is C, the detector is D.

When we shine a beam of light onto the cathode, the detector instantly starts
reporting electrons; apparently the light incident on the cathode triggers the expulsion
of electrons from the metal. For every one of these electrons, we find

$$E = h\nu - p \tag{11.27}$$

where $E$ is the observed energy of the electron, $\nu$ is the frequency of the incident
energy (interpreted as a wave!), $p$ is a constant characteristic of the metal, and $h$ is
a factor that seems constant for all metals and all wavelengths. This expulsion of
electrons by light is called the photoelectric effect.

If we perform this experiment repeatedly, two important phenomena become
clear. First, the energy $E$ of the electrons is independent of the amplitude of the
incident beam. In other words, if we illuminate the metal sequentially with a 20-watt
light bulb and then an otherwise identical 40-watt light bulb, we get more
electrons but their energy does not change. Second, even extremely low amplitudes
of light produce some electrons.

The wave theory is unable to account for these phenomena. For the first case, 
we would expect a stronger wave to impart more energy to the electrons as they 
are ejected from the metal. Second, if the incident energy is very low, then it would 
be spread all over the cathode, and nowhere would there be enough energy for an 
electron to actually manage to get away from the metal (from the above equation, 
that requires some energy characterized by p). 

Einstein postulated that the energy flowing along the incident beam is quantized
into small, individual packets called photons.$^1$ When a photon collides with an electron,
it transfers its energy to the electron. This transfer cannot happen partially; all
of the photon’s energy is contained in a single, indivisible packet, which is transferred
either in its entirety or not at all.

Since each photon interacts with each electron independently, we can see why
increasing the number of photons in the incident beam does not increase the energy of
the emitted electrons (though it produces more interactions and thus more electrons).
If the energy of a photon is too low, it will be below the threshold energy (call it $W_0$)
required to liberate an electron from the metal. In this case a dense beam of photons
of very low energy will not cause the cathode to emit any electrons at all. On the
other hand, if each photon has an energy $E > W_0$, then even a very sparse beam will
trigger the emission of some electrons, each with an energy $E - W_0$.

We find from the experiment that the energy $E$ of each photon is related to the
frequency $\nu$ of the incident energy, again interpreted as a wave. This relationship is
simply

$$E = h\nu = hc/\lambda \tag{11.28}$$

The constant $h$ is now known to be one of the fundamental values that determines
the structure of our physical universe. Known as Planck’s constant, it is tabulated in
Table E.3 along with the other physical constants used in this book.
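The threshold behavior can be sketched in a few lines of Python. The constants below are the standard SI values typed in directly rather than read from Table E.3, and the threshold energy `W0` is a made-up value for a hypothetical metal; the function names are also illustrative only.

```python
PLANCK_H = 6.62607015e-34      # Planck's constant, J s
SPEED_OF_LIGHT = 299792458.0   # speed of light in a vacuum, m/s

def photon_energy(wavelength):
    """Equation 11.28: E = h nu = h c / lambda (wavelength in meters)."""
    return PLANCK_H * SPEED_OF_LIGHT / wavelength

def electron_energy(wavelength, threshold):
    """Energy E - W0 of an ejected electron, or None below the threshold."""
    surplus = photon_energy(wavelength) - threshold
    return surplus if surplus > 0.0 else None

W0 = 3.6e-19  # hypothetical threshold energy in joules (about 2.2 eV)
for wavelength in (400e-9, 700e-9):
    print(wavelength, electron_energy(wavelength, W0))
```

With these illustrative numbers the 400 nm photons eject electrons and the 700 nm photons do not, no matter how many of them arrive, which is the behavior the wave theory could not explain.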

We can show that each photon has an apparent mass of $m = h\nu/c^2$. But this is not
a mass in the conventional sense of something that may be held still and weighed. A
photon is a composite entity of motion and energy; there is no such thing as a photon
at rest. The photoelectric effect by itself does not prove the existence of photons (the
photoelectric effect can be explained just based on the ideas of Planck’s constant)
and indeed Einstein’s paper only discussed the energy of the radiation [427].

Other experimental evidence of the particle nature of light is provided by quantitative
photochemistry, the Compton effect, the X-ray absorption edge, the Zeeman
effect, and the Raman effect [478].

The study of the particle nature of light is called geometrical optics.


1 Einstein’s Nobel prize in physics was awarded for his photon theory, not relativity.


11.6 Particle-Wave Duality

Resolution of the dual nature of light is addressed by quantum optics. Although we
will not go into this subject here, one basic idea is that a photon may be considered a
small, physically localized wave packet. The packet is a wave but it does not extend
infinitely. A highly readable and informative discussion of the resolution of these
dual natures of light is given by Feynman in a wonderful little book on quantum
electrodynamics [144].

In this book we will limit ourselves to the particle nature of light. The geometrical 
optics in this chapter will set the stage for this interpretation, and the particle-based 
transport theory in Chapter 12 will cement it. 

What we are excluding by this choice is all hope of cleanly modeling those phenomena
of light that are not handled by the particle model, specifically interference
and diffraction. These are not trivial phenomena. Interference accounts for the
brilliant colors that we see in thin films, including peacock feathers, oil slicks, and
soap bubbles. Diffraction is responsible for some (though not all) soft shadows and
light bleeding around the edges of objects.

The advantage of using geometrical optics is that it seems to be more amenable
to direct simulation on a computer for the types of complex environments and shading
models that we use in computer graphics. A number of reports have been published
in the image synthesis literature that use physical optics as an image-formation
model [248,314]. This work has typically required enormous computational
resources to produce results of significantly lower fidelity and complexity
than those attainable by geometrical optics.

Therefore we make a pragmatic choice, and select the particle model for its 
simplicity and power. To be blunt, we are simply saying that interference and 
diffraction are sufficiently infrequent or unimportant that we can afford to ignore 
them in our general theory. 


11.7 Reflection and Transmission

Reflection is the process whereby light of a specific wavelength incident on a material 
is at least partly propagated outward by the material without change in wavelength. 
We will have much more to say about reflection in Chapter 13, but for now we will 
simply discuss some of the larger-scale features of this interaction of light and matter. 
Most simple models of reflection distinguish a small number of categories that cover 
the various mechanisms by which light is propagated by a surface. These include 

Specular (also called regular, or mirror) reflection propagates light without scattering,
as from the surface of a perfectly smooth mirror, illustrated in Figure 11.11(a).

Diffuse reflection sends light in all directions with equal energy; this is illustrated in
Figure 11.11(b).

Mixed reflection is a combination of the two types described above. In a material
exhibiting mixed reflectance, its overall reflectance is given by a weighted
combination of diffuse and specular components. An example is shown in
Figure 11.11(c).

Retro-reflection occurs when the incident energy is reflected in directions close to
the incident direction, over a wide range of incident directions. Although
almost all materials are retro-reflective to some extent, those that retro-reflect
most of their incident energy are referred to as retro-reflectors. An example
retro-reflection profile is shown in Figure 11.11(d).

Gloss is defined as the property of a material surface that involves mixed reflection
and is responsible for a mirrorlike appearance of a rough surface. The
characteristics of gloss are usually described with the term glossiness.

FIGURE 11.11

Different forms of reflection. (a) Specular. (b) Diffuse. (c) Mixed. (d) Retro-reflection. (e) Gloss.

There are five kinds of glossiness, each described by its own scale of degree:
specular, sheen, contrast, distinctness of image, and absence of bloom [220,232]. A
perfect mirror has unit gloss, and a perfect diffuser (such as that approximated by
fine-ground glass) has zero gloss. The different types of gloss are measured by the
ratio of reflected to incident light at certain standard angles [232]. Figure 11.12
illustrates the following descriptions of different types of glossiness. In each case the
incident and reflected vectors are coplanar but on opposite sides of the normal. The
incident energy is measured as $\Phi_i$ and the reflected energy as $\Phi_r$ (or $\Phi_{r1}$ and $\Phi_{r2}$
when necessary). The energy leaving in the direction of the normal is $\Phi_n$.

FIGURE 11.12

(a) Specular. (b) Sheen. (c) Contrast. (d) Distinctness of image. (e) Absence of bloom. Redrawn
from Judd and Wyszecki, Color in Business, Science, and Industry, table 3.1, p. 408.

Specular: This measures the brightness of a highlight. The incident and reflected
vectors are set at 60° from the normal. The gloss factor is given by $\Phi_r/\Phi_i$.

Sheen: This is the brightness of a highlight at a glancing angle. The incident and
reflected vectors are set at 85° from the normal. The gloss factor is given by
$\Phi_r/\Phi_i$.

Contrast: This is the brightness of a highlight at a glancing angle, relative to the
energy leaving along the normal. The incident and reflected vectors are set at 85°
from the normal. The gloss factor is given by $\Phi_r/\Phi_n$.

Distinctness of image: This measures the clarity of the highlight or the sharpness of
its borders. The incident and reflected vectors are set at angles $\theta_i$ and $\theta_r$, which
are only a few minutes of arc different with respect to the normal. The gloss
factor is given by $d\Phi_r/d\theta_r$, which is the rate of change of the reflected energy
with $\theta_r$.

Absence of bloom: This measures the haziness around the highlight. A reflected
vector $R_1$ is set at the reflected direction; the other, $R_2$, is a few degrees off.
The gloss factor is given by $\Phi_{r2}/\Phi_{r1}$.

If the reflected light has the same reflectance for all incident azimuth angles $\psi$,
the reflection is termed isotropic; otherwise it is anisotropic.

Similarly, transmission (or refraction) is the process whereby light of a specific 
wavelength incident on the interface (or boundary) between two materials passes 
(or refracts) through the interface and into the other material without change in 
wavelength. Like reflection, there are several principal categories of transmission. 
These include 

Specular (or regular, or mirror) transmission propagates light into the new material
without scattering, as when light passes into a clear sheet of glass. This mode
is illustrated in Figure 11.13(a).

Diffuse transmission is transmission on a macroscopic scale, without a specular
component. As with reflection, diffuse transmission may be isotropic or
anisotropic. For example, diffuse transmission is often used for “art glass,”
to admit light but not permit clear visibility, such as for a shower door. This
mode is illustrated in Figure 11.13(b).

Mixed transmission is a combination of diffuse and specular transmission. Most
natural materials that admit transmission propagate light with both characteristics.
This mode is illustrated in Figure 11.13(c).

FIGURE 11.13

Different forms of transmission. (a) Specular. (b) Diffuse. (c) Mixed.


Traditional computer graphics rendering systems have emphasized four of these
main categories to model surfaces: diffuse and specular reflection, and diffuse and
specular transmission. The geometries for these modes are simple and well understood;
they are discussed below.


11.8 Index of Refraction

As discussed in Section 11.3, when light moves through a medium denser than a
vacuum, its speed decreases. When the extinction coefficient $\kappa = 0$, the ratio of the
speed of electromagnetic energy in a vacuum to its speed through the medium is the
simple index of refraction $\eta \in \mathcal{R}$ for that material:

$$\eta(\lambda) = \frac{c}{v_\lambda} \tag{11.29}$$

where

$v_\lambda$ is the velocity of light of wavelength $\lambda$ in the medium
$c$ is the speed of light in a vacuum

FIGURE 11.14

The index of refraction as a function of wavelength. The horizontal axis runs through
X-rays, far ultraviolet, near ultraviolet, visible, near infrared, far infrared, and radio
waves. Redrawn from Jenkins and White, Fundamentals of Optics, fig. 231, p. 478.


Note that the index of refraction is a function of wavelength. Figure 11.14 shows
a schematic representation of the index of refraction over a spectrum from X-rays
to radio waves.

Note that much of the curve is roughly flat with a downward slope; these are
called regions of normal dispersion. There are also places where the curve takes
a sudden dip and then rises significantly over a short interval before flattening out
again; these are regions of so-called anomalous dispersion. This latter name comes
from the fact that in this region, longer wavelengths are refracted more than shorter
ones. However, every substance has such a region at some wavelength, so the
phenomenon is actually quite normal [230].

Notice that sometimes the index of refraction dips below 1.0, implying that light
of that frequency will move through the medium faster than light in a vacuum. Although
this appears to violate a basic principle of relativity, this mathematical oddity
doesn’t represent an actual transfer of information. Relativity only places an upper
limit on the speed with which energy is conveyed from one place to another, and
this speed never exceeds the speed of light in the given medium. The essence of the
reasoning lies in the concept of the phase velocity of superimposed waves. Explanations
of anomalous dispersion, indices of refraction below 1, and superluminal
phase velocity are not relevant to our needs in this book, since they rarely occur
in the visible band. Detailed discussions of these phenomena may be found in the
optics texts mentioned in the Further Reading section.


11.8.1 Sellmeier's Formula

A good approximation to the index of refraction curve was given by Sellmeier in 1871
[230]. He proposed that the dip in refractive index was due to selective absorption
of particles that vibrate at a certain natural frequency $\nu_0$. He suggested that energy
passing through a material with such particles might resonate with them, producing
both constructive and destructive interference. For a single resonance frequency $\nu_0$,
Sellmeier’s equation is

$$\eta_\lambda^2 = 1 + \frac{A\lambda^2}{\lambda^2 - \lambda_0^2} \tag{11.30}$$

where

$\lambda_0$ is the wavelength of light with frequency $\nu_0$: $\lambda_0 = c/\nu_0$
$A$ is a constant for each material

If there are several resonant frequencies, Sellmeier’s equation may be written as a
summation of resonance terms:

$$\eta_\lambda^2 = 1 + \sum_{i=1} \frac{A_i\lambda^2}{\lambda^2 - \lambda_i^2} \tag{11.31}$$

We will assume that most optical materials have only one absorption band near
the visible region, so we will use the one-term form of Equation 11.30 in the following
discussions.

Consider again Figure 11.14. Notice that as $\lambda \to 0$, $\eta^2 \to 1$; as $\lambda \to \infty$,
$\eta^2 \to 1 + A$. Equation 11.31 agrees exactly with the results of an analysis based on
electromagnetic theory with some simplifying assumptions.
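Sellmeier's formula is straightforward to evaluate. The sketch below implements the one-term form of Equation 11.30 and the summed form of Equation 11.31 in Python; the function names are hypothetical, and the material constants ($A = 1.25$, $\lambda_0 = 0.1$ micrometers) are invented for illustration and do not describe any real glass.

```python
import math

def sellmeier_index(wavelength, A, lambda0_sq):
    """One-term Sellmeier formula (Equation 11.30); returns eta, not eta^2."""
    lam_sq = wavelength * wavelength
    return math.sqrt(1.0 + A * lam_sq / (lam_sq - lambda0_sq))

def sellmeier_index_multi(wavelength, terms):
    """Summed form (Equation 11.31); terms is a list of (A_i, lambda_i^2)."""
    lam_sq = wavelength * wavelength
    eta_sq = 1.0 + sum(A_i * lam_sq / (lam_sq - l_sq) for A_i, l_sq in terms)
    return math.sqrt(eta_sq)

# Hypothetical material: A = 1.25, lambda0 = 0.1 micrometers.
A, lambda0_sq = 1.25, 0.1 ** 2
for wavelength in (0.4, 0.55, 0.7):   # micrometers, spanning the visible band
    print(wavelength, sellmeier_index(wavelength, A, lambda0_sq))
```

Wavelengths and $\lambda_0$ only need to share the same unit, since the formula depends on their ratio. On the long-wavelength side of the resonance the index falls as the wavelength grows, which is the normal dispersion described above.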

Differentiating Equation 11.30 with respect to $\lambda$ yields

$$\frac{d\eta_\lambda}{d\lambda} = \frac{1}{2}\left[\frac{-2A\lambda^3}{(\lambda^2 - \lambda_0^2)^2} + \frac{2A\lambda}{\lambda^2 - \lambda_0^2}\right]\left[1 + \frac{A\lambda^2}{\lambda^2 - \lambda_0^2}\right]^{-1/2} \tag{11.32}$$

Equation 11.32 shows that the change in the index of refraction varies as a function
of the third power of wavelength. Thus the index of refraction of a material is a
strong function of wavelength and should not be approximated by a single number.

To use Sellmeier’s equation, we must obtain values for $A$ and $\lambda_0^2$ (note that we 
never need $\lambda_0$ itself, only its squared value). These values can be found by writing 
Equation 11.30 twice, at two different wavelengths for which the index of refraction 
is known, and solving simultaneously. We write 

$$\eta_1^2 = 1 + \frac{A\lambda_1^2}{\lambda_1^2 - \lambda_0^2} \qquad\qquad \eta_2^2 = 1 + \frac{A\lambda_2^2}{\lambda_2^2 - \lambda_0^2}$$

Solving for $\lambda_0^2$ and then $A$, we find 

$$\lambda_0^2 = \frac{S - U}{T - V} \qquad\qquad A = \frac{(\eta_1^2 - 1)(\lambda_1^2 - \lambda_0^2)}{\lambda_1^2} \tag{11.33}$$

where 

$$\begin{aligned}
S &= (\eta_2^2 - 1)\lambda_1^2\lambda_2^2 \\
T &= (\eta_2^2 - 1)\lambda_1^2 \\
U &= (\eta_1^2 - 1)\lambda_1^2\lambda_2^2 \\
V &= (\eta_1^2 - 1)\lambda_2^2
\end{aligned} \tag{11.34}$$




The computation of $A$ and $\lambda_0^2$ can be made efficient by making use of common 
subexpressions. Applying Sellmeier’s formula is simply an application of 
Equation 11.30. 
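
The two-sample fit just described can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from the text: the function names `fit_sellmeier` and `sellmeier_index` are hypothetical, the wavelengths may be in any consistent unit, and the intermediate quantities `S`, `T`, `U`, and `V` are written out directly from the two simultaneous equations so that the block stands on its own.

```python
import math

def fit_sellmeier(lam1, eta1, lam2, eta2):
    """Return (A, lam0_sq) for the one-term Sellmeier equation,
    given two (wavelength, index) samples."""
    a1 = eta1 * eta1 - 1.0          # eta_1^2 - 1
    a2 = eta2 * eta2 - 1.0          # eta_2^2 - 1
    l1 = lam1 * lam1                # lambda_1^2
    l2 = lam2 * lam2                # lambda_2^2
    # Common subexpressions (S, T, U, V in the text's notation).
    S = a2 * l1 * l2
    T = a2 * l1
    U = a1 * l1 * l2
    V = a1 * l2
    lam0_sq = (S - U) / (T - V)
    A = a1 * (l1 - lam0_sq) / l1
    return A, lam0_sq

def sellmeier_index(lam, A, lam0_sq):
    """Evaluate eta(lambda) from the one-term equation (needs a square root)."""
    l = lam * lam
    return math.sqrt(1.0 + A * l / (l - lam0_sq))
```

A quick way to gain confidence in the fit is to generate two samples from known constants, fit them, and confirm that the constants are recovered. The fit breaks down if the two samples give `T == V`, which happens when both wavelengths are the same.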

Sellmeier’s formula is accurate and theoretically justifiable. However, finding 
$\eta_\lambda$ requires a square root. We might be tempted to wonder if there is a good 
approximation to this formula that avoids the computationally expensive square 
root. The answer is yes, and it is to be found in Cauchy’s formula. 


11.8.2 Cauchy’s Formula 

Cauchy’s formula is only accurate in regions of normal dispersion. Computer graphics 
is fortunate that most materials do not have a region of anomalous dispersion 
near or in the visible band. Figure 11.15 shows the refractive index for several 
materials in the visible band. 

We can simplify Sellmeier’s equation to take advantage of the relative flatness of 
most refractive index curves in the visible band. Rewrite Equation 11.30 as 

$$\eta_\lambda^2 = 1 + \frac{A}{1 - \lambda_0^2/\lambda^2} \tag{11.36}$$

Expand Equation 11.36 with the binomial theorem: 

$$\eta_\lambda^2 = 1 + A\left(1 + \frac{\lambda_0^2}{\lambda^2} + \frac{\lambda_0^4}{\lambda^4} + \cdots\right) \tag{11.37}$$






FIGURE 11.15 

The refractive index in the visible band for several materials. Redrawn from Jenkins and White, 
Fundamentals of Optics, fig. 23B, p. 466. 


When $\lambda \gg \lambda_0$, then $\lambda_0/\lambda \to 0$, so we may truncate the higher-order terms, leaving 

$$\eta_\lambda^2 = 1 + A + \frac{A\lambda_0^2}{\lambda^2} \tag{11.38}$$

Writing $M = 1 + A$ and $N = A\lambda_0^2$, 

$$\eta_\lambda = \left(M + N\lambda^{-2}\right)^{1/2} \tag{11.39}$$

Again using the binomial theorem, this expands to 

$$\eta_\lambda = M^{1/2} + \frac{N}{2M^{1/2}\lambda^2} - \frac{N^2}{8M^{3/2}\lambda^4} + \cdots \tag{11.40}$$

If we again ignore high-order terms and retain only the first three, we obtain 

$$\eta(\lambda) = A + \frac{B}{\lambda^2} + \frac{C}{\lambda^4} \tag{11.41}$$

Equation 11.41 was first given by Cauchy in 1836. It only holds in regions of 
normal dispersion, and even there it is not as accurate as Sellmeier’s equation, but it 
is a useful approximation [230]. 

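To find the coefficients for Cauchy’s equation for some material, select three 
wavelengths $\lambda_1$, $\lambda_2$, and $\lambda_3$ for which the associated indices of refraction $\eta(\lambda_1) = \eta_1$, 
$\eta(\lambda_2) = \eta_2$, and $\eta(\lambda_3) = \eta_3$ are known. Then write the three simultaneous linear 
equations implied by these relations and solve them for $A$, $B$, and $C$: 

$$\begin{bmatrix} \eta(\lambda_1) \\ \eta(\lambda_2) \\ \eta(\lambda_3) \end{bmatrix} =
\begin{bmatrix} 1 & 1/\lambda_1^2 & 1/\lambda_1^4 \\ 1 & 1/\lambda_2^2 & 1/\lambda_2^4 \\ 1 & 1/\lambda_3^2 & 1/\lambda_3^4 \end{bmatrix}
\begin{bmatrix} A \\ B \\ C \end{bmatrix} \tag{11.42}$$

In matrix form, we may write $\mathbf{N} = \mathbf{L}\mathbf{A}$, so $\mathbf{A} = \mathbf{L}^{-1}\mathbf{N}$. Inversion of the matrix 
and expansion gives the following explicit formulas for $A$, $B$, and $C$ in terms of the 
indices of refraction at the selected wavelengths: 

$$A = k\left[\eta_1(sv - tu) + \eta_2(ru - qv) + \eta_3(qt - rs)\right] \tag{11.43}$$

$$B = k\left[\eta_1(t - v) + \eta_2(v - r) + \eta_3(r - t)\right] \tag{11.44}$$

$$C = k\left[\eta_1(u - s) + \eta_2(q - u) + \eta_3(s - q)\right] \tag{11.45}$$

where 

$$\begin{aligned}
q &= 1/\lambda_1^2 & r &= 1/\lambda_1^4 = q^2 \\
s &= 1/\lambda_2^2 & t &= 1/\lambda_2^4 = s^2 \\
u &= 1/\lambda_3^2 & v &= 1/\lambda_3^4 = u^2 \\
k &= \frac{1}{(sv - tu) + (ru - qv) + (qt - rs)}
\end{aligned} \tag{11.46}$$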
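
As a rough sketch of how the three-sample solution might look in a program, the following Python fragment solves the 3×3 system by cofactor expansion and then evaluates Cauchy’s formula. The names `fit_cauchy` and `cauchy_index` are hypothetical, and the normalizing factor `k` is computed here as the reciprocal of the determinant of the matrix in Equation 11.42, which is an assumption of this sketch.

```python
def fit_cauchy(lams, etas):
    """Return Cauchy coefficients (A, B, C) from three wavelengths
    and the three indices of refraction measured at them."""
    l1, l2, l3 = lams
    e1, e2, e3 = etas
    q, s, u = 1.0 / l1**2, 1.0 / l2**2, 1.0 / l3**2
    r, t, v = q * q, s * s, u * u
    # Determinant of the matrix [[1,q,r],[1,s,t],[1,u,v]].
    det = (s * v - t * u) + (r * u - q * v) + (q * t - r * s)
    k = 1.0 / det
    A = k * (e1 * (s * v - t * u) + e2 * (r * u - q * v) + e3 * (q * t - r * s))
    B = k * (e1 * (t - v) + e2 * (v - r) + e3 * (r - t))
    C = k * (e1 * (u - s) + e2 * (q - u) + e3 * (s - q))
    return A, B, C

def cauchy_index(lam, A, B, C):
    """Evaluate eta(lambda) = A + B/lambda^2 + C/lambda^4 (no square root)."""
    inv2 = 1.0 / (lam * lam)
    return A + inv2 * (B + C * inv2)
```

Evaluating `cauchy_index` costs one division and a few multiply-adds per wavelength. The three sample wavelengths must be distinct, or the determinant vanishes.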


11.9 Computing Specular Vectors 

When a ray of light is specularly reflected from a surface, it leaves the surface in 
a well-defined direction that is determined by the surface normal and the angle of 
incidence. Similarly, transmission is defined by the angle of incidence, the normal, 
and the indices of refraction of the two materials. We will construct these two vectors 
below. 

In this section we will identify vectors of unit length with a hat. So $|\hat{\mathbf{N}}| = 1$, but 
$|\mathbf{N}|$ can be any positive real number. For convenience, we will draw all vectors as 
radiating outward from the shading point. This means that the vector I representing 
the incident light is drawn pointing toward the source of the light, in exactly the 
opposite direction of the travel of light itself. 

FIGURE 11.16 

Geometry of specular reflection. (a) The vectors $\hat{\mathbf{I}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{R}}$. (b) The parallelogram formed by $\hat{\mathbf{I}}$ 
and $\hat{\mathbf{R}}$. 


11.9.1 The Reflected Vector 

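The two experimental facts that allow us to construct the specularly reflected vector 
$\hat{\mathbf{R}}$ for an incident vector $\hat{\mathbf{I}}$ and a given normal $\hat{\mathbf{N}}$ are that the three vectors are all 
coplanar, and that $\hat{\mathbf{I}} \cdot \hat{\mathbf{N}} = \hat{\mathbf{R}} \cdot \hat{\mathbf{N}}$. Using these constraints, Figure 11.16(a) shows the 
three vectors $\hat{\mathbf{I}}$ representing the incident light direction, $\hat{\mathbf{N}}$ representing the surface 
normal, and $\hat{\mathbf{R}}$ representing the reflected direction. Notice that they all have unit 
length. 

Figure 11.16(b) shows the parallelogram formed by $\hat{\mathbf{I}}$ and $\hat{\mathbf{R}}$. The vertical diagonal 
of this parallelogram is given by $2\mathbf{N}'$, where $\mathbf{N}'$ is a scaled version of the normal 
vector $\hat{\mathbf{N}}$: 

$$\mathbf{N}' = (\hat{\mathbf{I}} \cdot \hat{\mathbf{N}})\,\hat{\mathbf{N}}$$

From the parallelogram, we can see that 

$$\hat{\mathbf{R}} + \hat{\mathbf{I}} = 2\mathbf{N}' \tag{11.47}$$

or 

$$\hat{\mathbf{R}} = 2\mathbf{N}' - \hat{\mathbf{I}} = 2(\hat{\mathbf{I}} \cdot \hat{\mathbf{N}})\,\hat{\mathbf{N}} - \hat{\mathbf{I}} \tag{11.48}$$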

This is the form used by programs. 
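
A minimal Python sketch of Equation 11.48 follows. It assumes that both inputs are already unit length and that, following this section’s convention, the incident vector points away from the surface toward the light. The helper names `dot`, `normalize`, and `reflect` are hypothetical, and vectors are plain 3-tuples to keep the example free of dependencies.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors stored as tuples."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale a nonzero vector to unit length."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def reflect(I, N):
    """Reflected direction R = 2 (I . N) N - I for unit vectors I and N.
    I points from the shading point toward the light source."""
    d = dot(I, N)
    return tuple(2.0 * d * n - i for i, n in zip(I, N))
```

For example, with `N = (0, 1, 0)` and `I = normalize((1, 1, 0))`, the result mirrors the incident vector across the normal, so its x component changes sign while its y component is unchanged.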


11.9.2 Total Internal Reflection 

When light passes from one medium into a denser medium, its speed decreases. Thus 
we can never speak of “the speed of light” in the abstract; it must always be with 
respect to some medium. The most common reference medium is a perfect vacuum, 
though the speed of light through air is only slightly slower than through a vacuum. 

The surface where two media touch is called the interface; thus, we see a change 
in the speed of light at any interface between two materials of different densities. 
One ramification of the change in speed is that light appears to bend when passing 
through the interface. The amount of this bending, or refraction , is determined by 
the indices of refraction of the materials on both sides of the interface. 

The most basic physical law governing the geometry of refraction was described 
by Willebrord Snell of the University of Leyden, Holland, in an unpublished paper 
in 1621 [230]. Descartes later formulated a version of the relation based on the ratio 
of sines of the involved angles. The law relating the angles is thus variously known 
as Snell’s law and Descartes’ law. 

This law of refraction may be stated as 

$$\eta_i \sin\theta_i = \eta_t \sin\theta_t \tag{11.49}$$

where 

$\theta_i$ is the angle between an incoming ray and the normal at the interface 
$\theta_t$ is the angle between the transmitted ray and the reversed normal 
$\eta_i$ is the simple index of refraction for the incident medium 
$\eta_t$ is the simple index of refraction for the medium into which the light is transmitted 

FIGURE 11.17 

Snell’s law. 

The construction for this law is shown in Figure 11.17. The indices of refraction of 
the media on the incident and transmitted side of the interface are given, respectively, 
by $\eta_i$ and $\eta_t$. 

Figure 11.18(a) shows the result of some rays passing from one medium to a 
denser medium. In general, the transmitted ray is bent to lie closer to the surface 
normal than the incident ray. 

Figure 11.18(b) shows the path of several rays traveling from a dense material 
into a less-dense medium. In general, the transmitted ray is bent further from the 
surface normal than the incident ray. An implication of this statement is that at 
some angle, called the critical angle , the light is bent to lie exactly in the plane 
perpendicular to the normal at the point where the incident ray strikes the interface. 
At all angles greater than this the light is reflected back into the original medium. 
This phenomenon is called total internal reflection (TIR). From Snell’s law, we find 
that the critical angle $\theta_c$ may be found from 

$$\eta_i \sin\theta_c = \eta_t \sin(\pi/2)$$

$$\sin\theta_c = \frac{\eta_t}{\eta_i} \tag{11.50}$$

FIGURE 11.18 

Refraction. (a) Transmission into a denser medium. (b) Transmission into a rarer medium. 

In words, the critical angle is the smallest angle of incidence, in the denser material, 
for which light is totally reflected. Correct detection and handling of total internal 
reflection is critical for creating realistic images of transparent objects. 

Although Snell’s law is typically written as in Equation 11.49, a more precise 
statement is 

$$\eta_i(\lambda) \sin\theta_i = \eta_t(\lambda) \sin\theta_t \tag{11.51}$$

where the dependence of the indices of refraction on wavelength is made explicit. 
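
The critical-angle relation can be checked numerically with a short sketch. The function name `critical_angle` is hypothetical, and returning `None` when the light enters an equally dense or denser medium (where no critical angle exists) is a design choice of this example, not something prescribed by the text.

```python
import math

def critical_angle(eta_i, eta_t):
    """Critical angle in radians for light traveling from a medium with
    index eta_i into one with index eta_t, or None if eta_t >= eta_i
    (total internal reflection cannot occur in that case)."""
    if eta_t >= eta_i:
        return None
    return math.asin(eta_t / eta_i)
```

Because the indices depend on wavelength, a spectral renderer would evaluate this once per sampled wavelength rather than once per material pair.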


11.9.3 Transmitted Vector 

The graphical construction for the transmitted vector is a bit more complex than that 
for the reflected vector. Our experimental data is the coplanarity of the incident, 
normal, and transmitted vectors, and Snell’s law. Our construction of $\hat{\mathbf{T}}$ is based on 
the derivation given in Heckbert [209]. 

FIGURE 11.19 

The geometry of specular transmission. 

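Figure 11.19 shows the four vectors $\hat{\mathbf{I}}$, $\hat{\mathbf{M}}$, $\hat{\mathbf{N}}$, and $\hat{\mathbf{T}}$, where $\hat{\mathbf{I}}$ and $\hat{\mathbf{N}}$ are the same as 
before, $\hat{\mathbf{T}}$ is the transmitted vector, and $\hat{\mathbf{M}}$ will be constructed below. Our approach 
will be to decompose $\hat{\mathbf{T}}$ into two vectors $\hat{\mathbf{T}} = \mathbf{T}_\perp + \mathbf{T}_\parallel$, which are respectively 
perpendicular and parallel to the normal. 

We begin by constructing the vector $\hat{\mathbf{M}}$. This is a unit vector in the plane of the 
interface on the same side of the normal as $\hat{\mathbf{T}}$. We find $\hat{\mathbf{M}}$ by first projecting $\hat{\mathbf{I}}$ into 
the interface, normalizing the result, and then reflecting it. The projection of $\hat{\mathbf{I}}$ into 
the interface gives us the (nonunit) vector $\mathbf{I}_\perp$: 

$$\mathbf{I}_\perp = \hat{\mathbf{I}} - \cos\theta_i\,\hat{\mathbf{N}} \tag{11.52}$$

By construction we can see that this vector has length $\sin\theta_i$, so we divide by this 
magnitude and multiply by $-1$ to flip it around and get the unit-length vector $\hat{\mathbf{M}}$: 

$$\hat{\mathbf{M}} = -\frac{1}{\sin\theta_i}\left(\hat{\mathbf{I}} - \cos\theta_i\,\hat{\mathbf{N}}\right) \tag{11.53}$$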

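Now we can see from the construction that $|\mathbf{T}_\parallel| = \cos\theta_t$ and $|\mathbf{T}_\perp| = \sin\theta_t$. 
Further, $\mathbf{T}_\parallel$ is antiparallel to $\hat{\mathbf{N}}$ and $\mathbf{T}_\perp$ is parallel to $\hat{\mathbf{M}}$, so 

$$\begin{aligned}
\hat{\mathbf{T}} &= \mathbf{T}_\perp + \mathbf{T}_\parallel \\
&= \hat{\mathbf{M}} \sin\theta_t - \hat{\mathbf{N}} \cos\theta_t \\
&= -\frac{\sin\theta_t}{\sin\theta_i}\left(\hat{\mathbf{I}} - \cos\theta_i\,\hat{\mathbf{N}}\right) - \hat{\mathbf{N}} \cos\theta_t
\end{aligned} \tag{11.54}$$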


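Equation 11.54 is a perfectly valid expression for $\hat{\mathbf{T}}$, but it requires us to compute 
a few sines and cosines we would like to avoid. The only such expression that’s 
computationally convenient is $\cos\theta_i = \hat{\mathbf{I}} \cdot \hat{\mathbf{N}}$, so it would be nice to get everything in 
terms of $\cos\theta_i$. 

We begin by noticing that $\sin\theta_t/\sin\theta_i = \eta_i/\eta_t$ from Snell’s law, so plugging this 
in, expanding the terms, and collecting for $\hat{\mathbf{N}}$, we find 

$$\begin{aligned}
\hat{\mathbf{T}} &= -\frac{\eta_i}{\eta_t}\left(\hat{\mathbf{I}} - \cos\theta_i\,\hat{\mathbf{N}}\right) - \hat{\mathbf{N}} \cos\theta_t \\
&= -\frac{\eta_i}{\eta_t}\,\hat{\mathbf{I}} + \hat{\mathbf{N}}\left(\frac{\eta_i}{\eta_t}\cos\theta_i - \cos\theta_t\right)
\end{aligned} \tag{11.55}$$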

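The only thing left is to express $\cos\theta_t$ in terms of $\cos\theta_i$. We can do this using some 
trig substitutions: 

$$\begin{aligned}
\cos\theta_t &= \sqrt{1 - \sin^2\theta_t} \\
&= \sqrt{1 - \left(\frac{\eta_i}{\eta_t}\right)^2 \sin^2\theta_i} \\
&= \sqrt{1 - \left(\frac{\eta_i}{\eta_t}\right)^2 \left(1 - \cos^2\theta_i\right)}
\end{aligned} \tag{11.56}$$

Putting this back into the expression for $\hat{\mathbf{T}}$, we find 

$$\hat{\mathbf{T}} = -\frac{\eta_i}{\eta_t}\,\hat{\mathbf{I}} + \hat{\mathbf{N}}\left(\frac{\eta_i}{\eta_t}\cos\theta_i - \sqrt{1 - \left(\frac{\eta_i}{\eta_t}\right)^2 \left(1 - \cos^2\theta_i\right)}\right) \tag{11.57}$$

Note that $1 - (\eta_i/\eta_t)^2(1 - \cos^2\theta_i)$ may be negative. This is our signal that total 
internal reflection has occurred. 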
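
The final expression for the transmitted vector, together with the negative-radicand test, might be coded along the following lines. This is a sketch under stated assumptions: both input vectors are unit length, the incident vector points away from the surface on the same side as the normal (so that the cosine of the incident angle is nonnegative), and the function signals total internal reflection by returning `None`. The names `dot` and `refract` are hypothetical.

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors stored as tuples."""
    return sum(x * y for x, y in zip(a, b))

def refract(I, N, eta_i, eta_t):
    """Transmitted unit vector for unit vectors I (toward the light) and N,
    or None when total internal reflection occurs."""
    ratio = eta_i / eta_t
    cos_i = dot(I, N)                        # cos(theta_i) = I . N
    radicand = 1.0 - ratio * ratio * (1.0 - cos_i * cos_i)
    if radicand < 0.0:
        return None                          # total internal reflection
    cos_t = math.sqrt(radicand)              # cos(theta_t)
    scale_n = ratio * cos_i - cos_t
    return tuple(-ratio * i + scale_n * n for i, n in zip(I, N))
```

At normal incidence the function returns the reversed normal, and for a ray inside glass (index 1.5) striking an interface with air at 60 degrees from the normal it returns `None`, since that angle exceeds the critical angle of roughly 41.8 degrees.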


11.10 Further Reading 

The information in this chapter is common to most basic books on optics. Some 
well-known examples include the books by Born and Wolf [55], Hecht and Zajac 
[201], Jenkins and White [230], Williams and Becklund [478], and Moller [311]. 
Review guides such as Hecht’s study outline [200] discuss the basic ideas and 
include worked problem sets with discussion. 

A variety of multiple-slit experiments are discussed very nicely by Moller [311]. 
Feynman has written a highly readable lay account of the dual nature of light in 
his book QED [144], which explains some essential parts of the quantum theory 
addressing wave/particle duality. Crystals are an especially interesting and useful 
class of materials; the interaction of light and crystals is discussed at length by Wood 
[488]. A discussion of light from the viewpoint of modern quantum mechanics may 
be found in the book by Sudbery [427]. 

11.11 Exercises 

Exercise 11.1 

Using only the two formulas $\sin\theta_i = \sin\theta_r$ and $\mathbf{R} = a\mathbf{I} + b\mathbf{N}$, provide an algebraic 
derivation of the formula for the reflected ray $\mathbf{R}$. (Do not use a geometric construction; 
you may use trig identities and the basic properties of vectors, though.) 

Exercise 11.2 

Using only the two formulas $\eta_i \sin\theta_i = \eta_r \sin\theta_r$ and $\mathbf{T} = a\mathbf{I} + b\mathbf{N}$, provide an 
algebraic derivation of the formula for the refracted ray T. (Do not use a geometric 
construction; you may use trig identities and the basic properties of vectors, though.) 
If there are choices to be made at some steps, explain your reasoning. 

Exercise 11.3 

We assumed in Section 11.8.1 that it was reasonable to use only one term of Sellmeier’s 
formula to compute the index of refraction. It might be argued that the 
two-term formula is likely to be superior, particularly if it uses one absorption band 
on each side of the visible region. Do you agree with this argument? Under all 
circumstances? Assuming finite-precision arithmetic, are there situations when the 
two-term form is superior? Are there times it doesn’t matter? Find expressions for 
the four constants in the two-part form (advice: use a symbolic math package). How 
much more computational cost is involved in evaluating this expression for different 
wavelengths? Is it worth it? 

Exercise 11.4 

Study the phenomena of magneto-optics and electro-optics. Discuss how you would 
implement these effects. Are there any applications for this work? 

Exercise 11.5 

Prove that when the radical in Equation 11.57 is exactly zero, we are at the critical 
angle $\theta_i = \theta_c$. 




I never cared much for moonlit skies, 

I never wink back at fireflies; 

But now that the stars are in your eyes, 
I’m beginning to see the light. 

Harry James, Duke Ellington, Johnny Hodges, 
and Don George 

(“I’m Beginning to See the Light,” 1944) 



ENERGY 


TRANSPORT 


12.1 Introduction 

In this chapter we look at a method for quantifying the passage of energy through 
a medium. We will assume that energy is quantized into small, discrete packets, 
which we will model as particles with particular properties. We will describe the 
flow of energy through a medium by simply keeping track of the number of particles 
flowing through each region of the medium. Of course, we will later interpret these 
particles as light particles (photons), but it takes no more effort to express the theory 
in general and then later reduce it to that special case. 

Techniques for analyzing the flow of moving particles in 3D environments have 
been developed in great detail in a number of fields. We will base our discussion on 
an approach known as transport theory . This approach has been developed largely 
for simulating the activity of neutrons in atomic reactors, but is appropriate to such 
varied phenomena as automobile traffic flow, the configuration of large molecules, 
gas and plasma dynamics, and (most importantly for us) light. 

The purpose of this chapter is to develop a general transport theory that is 
appropriate for modeling light energy. We will make a few basic assumptions about 
the properties of the particles and the media that are based on our knowledge of light, 
but our discussion will be entirely in terms of abstract particles. We will develop 
a general transport equation that describes this energy flow. We will then cast this 
equation into a form that makes it amenable to solution by computer programs. 

This particle-based approach limits our theory to the phenomena described by 
geometrical optics . As mentioned in Chapter 11, by taking a particle theory approach 
to modeling light, we are excluding the possibility of treating light as a wave, and 
therefore we will not handle wave phenomena like diffraction and interference. 

After we discuss the quantitative measurement of light energy in Chapter 13, we 
will return to the results developed here to write a general equation that describes 
the distribution of light in a scene. 

The solution to that transport equation is the holy grail of photorealistic image 
synthesis: for every point in the environment, it completely describes the intensity of 
light at that point in every direction, wavelength, and polarization. This is the raw 
energy information that we use to construct a simulated image, because we typically 
imagine our image as the light distribution that would fall on some piece of film in 
space. We can simply use the light transport equation to find the description of the 
light at every point on that film. 

We begin our discussion with a very simple transport problem that we can solve 
analytically to give a general view of what ideas are involved and how they interact. 
Then we will generalize the problem to the more complex environments used in 
computer graphics. In general, these more complex problems will require numerical 
methods to find even approximate solutions. 


12.2 The Rod Model 

We will begin our discussion of transport theory with a simple model. Even though 
the geometry will be very restrictive, we will encounter all of the concepts that 
are important to a complete transport equation. Our presentation follows that of 
Wing [482]. 

We will study the flow of particles through a long, narrow, circular rod, as shown 
in Figure 12.1. The rod has length a and is parameterized by the distance x, such 
that the left end is x = 0 and the right end is x = a. The area of a cross section of 
the rod is A . 

We suppose that inside the rod there is only one type of particle, with the following 
properties: 

1 Each particle moves either to the left (parallel to the vector L) or to the right 
(parallel to the vector R). 

2 All particles move with the same speed c. 

3 Particles do not interact. Thus, two particles may pass through each other. 






FIGURE 12.1 

A rod of length a. 


12.3 Particle Density and Flux 

We characterize the distribution of particles with two functions, one for each direction 
of motion. The function $\rho_l(x)$ specifies the expected density (the number of 
particles per unit volume) flowing to the left at point $x$. Similarly, the function $\rho_r(x)$ 
specifies the expected density of particles flowing to the right at point $x$. 

We will also find it useful to describe how many particles are flowing through a 
cross section of the rod per unit time. For example, suppose we wish to know how 
many right-moving particles pass through the cross section of the rod at $x = x_0$ in a 
time interval $\Delta t$. Since the particles all move at a constant speed $c$, any particle within 
a distance of $c\Delta t$ to the left of $x_0$ will pass through $x_0$ in this time interval. The subrod 
over the interval $[x_0 - c\Delta t, x_0]$ has a volume $Ac\Delta t$, as shown in Figure 12.2(a). Since 
there are $\rho_r(x_0)$ right-moving particles per unit volume at $x_0$, there are $\rho_r(x_0)Ac\Delta t$ 
particles in this volume. 

Since $\rho_r(x_0)Ac\Delta t$ right-moving particles will pass through the rod at $x_0$ in 
time $\Delta t$, the number of particles passing through this point per unit time is simply 
$Ac\rho_r(x_0)$. The same analysis holds for the left-moving particles in the subrod 
$[x_0, x_0 + c\Delta t]$ illustrated in Figure 12.2(b). 

The number of particles per unit time crossing a piece of surface is called the flux 
(Latin for “flow”), usually symbolized by the Greek letter $\Phi$: 

$$\begin{aligned}
\Phi_R(x) &= Ac\rho_r(x) \\
\Phi_L(x) &= Ac\rho_l(x)
\end{aligned} \tag{12.1}$$

FIGURE 12.2 

(a) The subrod to the left of $x_0$. (b) The subrod to the right of $x_0$. 


12.4 Scattering 

If the rod contains nothing (that is, it is internally a perfect vacuum), then any 
particles injected from either end will flow unimpeded to the other end (recall that 
by definition our particles pass through each other). However, if the rod contains 
a material that interacts with the particles, we must consider the results of those 
interactions. 

We will model these interactions statistically. Suppose that the medium consists 
of dense blobs of material separated by a vacuum. Then some particles will bypass 
all the blobs, while others will collide with one of these pockets of material. We 
posit a collision probability, denoted $\sigma$, that specifies the probability that a particle 
will collide with a blob for each unit of the material traversed. Thus the probability 
that a right-moving particle injected into the left end of the rod will collide with 
the medium before it exits at the right end is $\sigma a$; the probability that it will escape 
without collision is $(1 - \sigma a)$. The probability $\sigma$ is called the cross section by physicists; 
because of the possible confusion of this term with geometric cross sections often 
discussed in computer graphics, we will not use that name in this book; we call $\sigma$ 
the scattering probability or the collision probability. If the value of $\sigma$ is the same 
in all directions (here only two, left and right), the material is said to be isotropic; 
otherwise it is anisotropic. 

We will assume that when a particle collides with the material in the rod, only 
one of two basic results can occur. First, the particle can be absorbed . In this case, 
the particle disappears and is converted into some other form of energy, such as heat. 
Alternatively, the particle may be scattered . 

What happens when a particle is scattered depends on the medium. In general, 
one or more particles leave the collision site (or the event) in one or more directions. 
For example, a perfectly elastic scattering event is like the collision of two billiard 
balls: no new particles are created, and the direction of the scattered particle may be 
predicted with confidence. Alternatively, the collision of a neutron with an atomic 
nucleus can result in fission, whereby several new neutrons are released in a variety 
of directions. These are two extremes; the way a material scatters particles is one of 
the basic parameters that characterizes the appearance of the material, as we will see 
later when we discuss shading models in Chapter 15. 

FIGURE 12.3 

Our scattering rule results in two particles in opposite directions for each incident particle. (a) A 
collision: a right-moving particle strikes a blob. (b) The result: one particle in each direction leaves 
the collision site. 

In the rod model, we will use the following scattering rule for all particles: 
whenever a particle is scattered, two particles leave the scattering site, one in each 
direction. This is illustrated in Figure 12.3. 

For the time being, we will suppose our medium has no absorption, and only 
scatters particles. 


12.4.1 Counting New Particles 

In the next section we will need to discover the probability that a particle entering 
a subrod $[x_0, x_0 + \Delta x]$ will be scattered. Consider first the right-moving flux at the 
right end $x_0 + \Delta x$ due to the right-moving particles entering at $x_0$. Suppose $n$ right-moving 
particles enter the left end of the subrod at $x_0$. From the definition of the 
scattering probability $\sigma$, we expect that $n\sigma\Delta x$ particles will be scattered. Since each 
scattering event produces exactly one right-moving particle, the scattered particles 
survive in their progeny, and we would expect $n$ right-moving particles to exit the 
right side of the subrod. 

Because of scattering, left-moving particles will also contribute to right-moving 
flux at $x_0 + \Delta x$. How many right-moving particles will be generated for each left-moving 
particle? 

FIGURE 12.4 

A space-time diagram of scattering particles. Location is plotted horizontally, time vertically. The 
vertical stripes represent the location of unmoving blobs in the rod. 

To answer this question, consider the result of a single collision: one left-moving 
particle enters the scattering event and two particles come out, one in each direction. 
If one of those new particles is again scattered, then we will have three particles in the 
system, and so on. Some of these will emerge at the right end, increasing the number 
of right-moving particles there. We can account for these higher-order effects with a 
diagram like Figure 12.4. 

In this figure, we plot the location of a particle on the X axis, as time flows along 
the Y axis. Here we begin with a single particle $L_1$ entering the rod at $x + \Delta x$, 
moving to the left with constant speed $c$. The particle strikes a blob centered at $x_1$ 
(notice that since a blob doesn’t move over time, it is represented by a vertical stripe 
in the figure). Call this collision event $S_1$. The probability $P(S_1)$ of this collision 
occurring is $P(S_1) = \sigma\Delta x$. The result of the collision is that $L_1$ is considered to 
be destroyed and replaced by a new pair of second-generation particles, $L_2$ and $R_2$, 
both leaving the event at constant speed $c$ in opposite directions. This event has 
increased by one the number of right-moving particles in the system. 

We now want to consider what will happen if particle $L_2$ is itself scattered by 
striking another blob at $x_2 < x_1$, destroying $L_2$ and replacing it with two new 
third-generation particles $L_3$ and $R_3$. Call this event $S_2$. This would add yet 
another right-moving particle to our system, though there would still be only one 
left-moving particle. The probability that $S_2$ will occur, given that $S_1$ has occurred, 
is $P(S_2 \mid S_1) = (x_2 - x_1)\sigma\Delta x = \alpha\sigma\Delta x$. 

Therefore the total probability $P(S_2)$ of event $S_2$ occurring is given by the product 
of the probability of $S_2$ occurring given that $S_1$ occurred and the probability of $S_1$ occurring: 

$$P(S_2) = P(S_2 \mid S_1)\,P(S_1) = (\sigma\Delta x)(\alpha\sigma\Delta x) = (\alpha\sigma^2)(\Delta x)^2 \tag{12.2}$$

The important thing to notice here is that the probability of a second scattering event 
is proportional to $(\Delta x)^2$. The probability of a third event would be proportional to 
$(\Delta x)^3$, and so on. 

We don’t need to keep explicit track of these higher-order terms. Because the 
events are separated by space, when the rod becomes small enough (that is, $\Delta x \to 0$), 
there is only room for one scattering event. Higher-order events are handled in 
different subrods. So we will abstract away all the higher-order terms into a single 
composite term $o(\Delta x)$, which represents the probability of multiple collisions in a 
length $\Delta x$ of material. 

In summary, we can then say that the expected number of right-moving particles 
produced by a single left-moving particle is $\sigma\Delta x + o(\Delta x)$. 


12.5 The Scattering-Only Particle Distribution Equations 

We are now ready to start looking at the equations that describe the distribution of 
particles in the rod. 

We begin with a small piece of the rod over the interval $[x_0, x_0 + \Delta x]$. Consider 
first what happens to the right-moving particles that enter this subrod at $x_0$. We 
know that the number of right-moving particles at this point per unit time is given 
by the flux, $\Phi_R(x_0)$, and we want to find $\Phi_R(x_0 + \Delta x)$. 

We find the right-moving flux $\Phi_R(x_0 + \Delta x)$ as the sum of three component fluxes, 
illustrated in Figure 12.5. 

1 $\Phi_R^u$: the flux due to right-moving particles that enter from the left per unit time 
and are not scattered, and thus emerge unscathed. This is just the probability 
of a particle not scattering times the expected number of particles per unit time 
and area. This flux is given by $\Phi_R^u = \Phi_R(x_0)(1 - \sigma\Delta x)$. 

2 $\Phi_R^s$: the flux due to right-moving particles that enter from the left per unit 
time and are scattered, each producing one new right-moving particle. This 
flux is $\Phi_R^s = \Phi_R(x_0)\sigma\Delta x + o(\Delta x)$. 

3 $\Phi_L^s$: the flux due to left-moving particles that enter from the right per unit 
time and are scattered, each producing a new right-moving particle. The left-moving 
flux entering at the right side is $\Phi_L(x_0 + \Delta x)$. We expect the number of 
scattered particles per unit time to then be $\Phi_L^s = \Phi_L(x_0 + \Delta x)\sigma\Delta x + o(\Delta x)$. 
Each of these collisions produces exactly one right-moving particle. 

FIGURE 12.5 

The three components of flux in the rod. 

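Adding these together, we find the flux emerging from the right side of the subrod 
is 

$$\begin{aligned}
\Phi_R(x_0 + \Delta x) &= \Phi_R^u + \Phi_R^s + \Phi_L^s \\
&= \Phi_R(x_0)(1 - \sigma\Delta x) + \Phi_R(x_0)\sigma\Delta x + o(\Delta x) \\
&\quad + \Phi_L(x_0 + \Delta x)\sigma\Delta x + o(\Delta x) \\
&= \Phi_R(x_0) + \Phi_L(x_0 + \Delta x)\sigma\Delta x + o(\Delta x)
\end{aligned} \tag{12.3}$$

where we have rolled together all the higher-order terms into one $o(\Delta x)$ term. 

Writing $x$ for $x_0$, we can now subtract $\Phi_R(x)$ from both sides, divide through by 
$\Delta x$, and take the limit as $\Delta x \to 0$ (we assume that the necessary continuity and limit 
conditions are satisfied so that this is a well-defined set of operations): 

$$\lim_{\Delta x \to 0} \frac{\Phi_R(x + \Delta x) - \Phi_R(x)}{\Delta x} = \lim_{\Delta x \to 0} \left(\sigma\Phi_L(x + \Delta x) + \frac{o(\Delta x)}{\Delta x}\right)$$

$$\frac{d\Phi_R(x)}{dx} = \sigma\Phi_L(x) \tag{12.4}$$

Repeating this process for the left-moving particles emerging at $x_0$, we find a similar 
result for $\Phi_L(x)$, which differs only by sign: 

$$\frac{d\Phi_L(x)}{dx} = -\sigma\Phi_R(x) \tag{12.5}$$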




















Equations 12.4 and 12.5 form a pair of differential equations that describe the 
flux in the rod given that the rod’s only effect on the particles is to scatter them. 

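Our next job is to solve these equations by finding an expression for the unknown 
flux. We will go after $\Phi_R(x)$ first. We begin by combining these equations into a 
single second-order differential equation for $\Phi_R(x)$: 

$$\begin{aligned}
\frac{d^2\Phi_R(x)}{dx^2} &= \frac{d}{dx}\,\frac{d\Phi_R(x)}{dx} = \frac{d}{dx}\left(\sigma\Phi_L(x)\right) = \sigma\,\frac{d\Phi_L(x)}{dx} \\
&= \sigma\left(-\sigma\Phi_R(x)\right) = -\sigma^2\Phi_R(x)
\end{aligned} \tag{12.6}$$

so 

$$\frac{d^2\Phi_R(x)}{dx^2} + \sigma^2\Phi_R(x) = 0 \tag{12.7}$$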


Because this equation is of second order, we need two constraints to completely 
define its solution [60]. It is convenient to provide these constraints as boundary conditions 
that specify the flux at the two ends of the rod. Suppose we inject one left-moving 
particle per second at the right end and no right-moving particles at the left: 

$$\begin{aligned}
\Phi_R(0) &= 0 \\
\Phi_L(a) &= 1
\end{aligned} \tag{12.8}$$

Our goal is now to find a solution $\Phi_R(x)$ to Equation 12.7 using the boundary 
conditions in Equation 12.8. Although they are sometimes easier to specify, boundary 
condition problems are in general much harder to solve than initial condition 
problems. Fortunately, Equation 12.4 gives us an easy way to convert Equation 12.8 
into initial conditions: 

$$\left.\frac{d\Phi_R(x)}{dx}\right|_{x=a} = \sigma\Phi_L(a) = \sigma \tag{12.9}$$

For simplicity, in the next few paragraphs we will write $y(x)$ for $\Phi_R(x)$, and $y'(x)$ 
and $y''(x)$ for its first and second derivatives. Then our problem may be stated as 
finding a function $y(x)$ that satisfies 

$$\begin{aligned}
y''(x) + \sigma^2 y(x) &= 0 \\
y(0) &= 0 \\
y'(a) &= \sigma
\end{aligned} \tag{12.10}$$

Equation 12.10 specifies an initial-value problem for a second-order linear homogeneous 
differential equation with constant coefficients. Therefore we are tempted to 
try $y(x) = e^{rx}$ as a potential solution [60]. This leads to the trial solution 

$$r^2 e^{rx} + \sigma^2 e^{rx} = 0 \tag{12.11}$$

which, after dividing by $e^{rx} \neq 0$, results in the characteristic equation [60]: 

$$r^2 + \sigma^2 = 0 \tag{12.12}$$

which has roots $r_1 = j\sigma$, $r_2 = -j\sigma$ (recall that in this book $j = \sqrt{-1}$). So we have 
found two solutions: 

$$\begin{aligned}
y_1(x) &= e^{j\sigma x} \\
y_2(x) &= e^{-j\sigma x}
\end{aligned} \tag{12.13}$$

These functions contain $j$, which is rather awkward; we would prefer an equivalent 
form involving only real numbers. This form is easily found. 

We know from the theory of differential equations that all linear combinations 
of $y_1(x)$ and $y_2(x)$ are also solutions to Equation 12.10. So we create two new 
functions, $g_1(x)$ and $g_2(x)$, formed from the sum and difference of the previous 
solutions: 

$$\begin{aligned}
g_1(x) &= y_1(x) + y_2(x) = e^{j\sigma x} + e^{-j\sigma x} = 2\cos\sigma x \\
g_2(x) &= y_1(x) - y_2(x) = e^{j\sigma x} - e^{-j\sigma x} = 2j\sin\sigma x
\end{aligned} \tag{12.14}$$


using Euler’s identities for sine and cosine. Our general solution is thus a linear 
combination of these two solutions: 


$$\begin{aligned}
y(x) &= C_1 g_1(x) + C_2 g_2(x) \\
     &= c_1 \cos\sigma x + c_2 \sin\sigma x
\end{aligned} \tag{12.15}$$

where we have rolled the constants from $g_1(x)$ and $g_2(x)$ into $c_1$ and $c_2$.

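To find these constants we employ the initial values. First, noting that $y(0) = 0$,

$$\begin{aligned}
y(0) &= c_1 \cos(\sigma \cdot 0) + c_2 \sin(\sigma \cdot 0) \\
0 &= c_1
\end{aligned} \tag{12.16}$$

Second, we use the value of the derivative at $a$:

$$\begin{aligned}
y'(a) &= -c_1 \sigma \sin\sigma a + c_2 \sigma \cos\sigma a \\
\sigma &= c_2 \sigma \cos\sigma a \\
\frac{1}{\cos\sigma a} &= c_2
\end{aligned} \tag{12.17}$$

We now have our complete solution for $y(x) = \Phi_R(x)$:

$$ \Phi_R(x) = \frac{\sin\sigma x}{\cos\sigma a} \tag{12.18} $$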


Once again we will use Equation 12.4, this time to find $\Phi_L(x)$ from $\Phi_R(x)$:

$$ \Phi_L(x) = \frac{1}{\sigma}\frac{d\Phi_R(x)}{dx} = \frac{\cos\sigma x}{\cos\sigma a} \tag{12.19} $$

FIGURE 12.6

(a) The flux $\Phi_R(x)$ in the rod for a = 0.5 and different values of a. (b) The flux $\Phi_L(x)$ in the rod
for a = 0.5 and different values of a.


This completes our quest for the flux in the rod given these initial conditions. 
Figure 12.6 shows plots for this flux along the rod for different values of a. 

The solution in Equations 12.18 and 12.19 goes to infinity when $a = \pi/(2\sigma)$. We
say that a rod with these boundary conditions and material is critical at this length. If
$a > \pi/(2\sigma)$, the fluxes go negative, which is mathematically well defined but physically
meaningless. Thus all rods of length $a < \pi/(2\sigma)$ can maintain a steady state given this
configuration; rods longer than that length are not physically realizable.
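
The closed-form fluxes and the critical length are easy to explore numerically. The
following is a minimal Python sketch, assuming the solution
$\Phi_R(x) = \sin\sigma x/\cos\sigma a$ and $\Phi_L(x) = \cos\sigma x/\cos\sigma a$ derived above. The
function name `rod_flux` and the sample values of $\sigma$ and $a$ are our own illustrative
choices, not part of the derivation.

```python
import math


def rod_flux(x, a, sigma):
    """Return (phi_R, phi_L) at position x for a rod of length a.

    Assumes the boundary conditions phi_R(0) = 0 and phi_L(a) = 1 and a
    rod shorter than the critical length pi / (2 * sigma).
    """
    critical = math.pi / (2.0 * sigma)
    if not 0.0 < a < critical:
        raise ValueError("rod length must satisfy 0 < a < pi / (2 sigma)")
    denom = math.cos(sigma * a)
    return math.sin(sigma * x) / denom, math.cos(sigma * x) / denom


# Illustrative values only: flux at the right end for a few rod lengths.
sigma = 0.5
for a in (1.0, 2.0, 3.0):
    phi_r, phi_l = rod_flux(a, a, sigma)
    print(f"a = {a}: phi_R(a) = {phi_r:.3f}, phi_L(a) = {phi_l:.3f}")
```

As $a$ approaches $\pi/(2\sigma)$ the denominator $\cos\sigma a$ approaches zero, which is why the
sketch rejects rods at or beyond the critical length.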

Criticality tells us what’s happening to the flux in the system over time. If there are
more losses than gains, the system is subcritical, and eventually the flow will damp
out. If gains outnumber losses, then the system will avalanche (or experience a chain
reaction), producing ever more particles until saturation; such a system is called
supercritical. When gains and losses are balanced, then the system is self-sustaining,
or simply critical [431].


12.6 A More Complete Medium 

We can generalize the result of the last section to a medium with richer properties.
When we move to 3D, we will need to be able to account for particles traveling in
any direction. To prepare for that generalization, in this section we will refer to an
arbitrary direction $v$, and some other direction $v' \ne v$. In the rod, when $v$ is equal
to L, then $v'$ is R, and vice versa.

FIGURE 12.7

A subrod in the interval $[x_0, x_0 + \Delta x]$.

We will characterize the medium by six properties and their effects on the flux. 
They are illustrated in Figure 12.7. 


Reflection $\Phi_r(x, v)$: The ends of the rod may return (or reflect) some portion of their
incident particles back into the rod. The albedo, denoted $0 \le \beta \le 1$, describes
this fraction. In our model, the ends of the rod have albedos $\beta_0$ and $\beta_a$.
Any particles not returned by the rod ends are assumed to be absorbed. So
the left-moving flux reflected back into the rod as right-moving flux at $x = 0$
is $\Phi_r(0, R) = \beta_0 \Phi(0, L)$, and similarly, $\Phi_r(a, L) = \beta_a \Phi(a, R)$.

Surface emission $\Phi_s(0, R)$, $\Phi_s(a, L)$: The end surfaces may emit (or introduce)
particles into the rod. The number of particles emitted per unit time is called the
surface (or boundary) flux. At the left end ($x = 0$), the emitted particles move
to the right and are characterized by $\Phi_s(0, R)$. At the right end ($x = a$), the
emitted particles move to the left and are characterized by $\Phi_s(a, L)$.

Absorption $\sigma_a(x, v)$: When particles travel through the rod they may strike some of
the rod material and be absorbed. In this case they simply disappear from the
system, their energy typically converted into another form (such as heat). The
probability of this absorption happening per unit length of the rod at location
$x$ for left-moving particles is given by $\sigma_a(x, L)$. Similarly, the probability of
absorption per unit length for right-moving particles is given by $\sigma_a(x, R)$.

Outscatter $\sigma_s(x, v \to v')$: When a particle strikes some piece of the material in the
rod, it may change direction. We say it is outscattered, or backscattered. In
the rod, this can only mean that the particle is sent back into the direction
from which it came. If the particle originally was traveling in the direction $v$,
then after backscattering it is sent away in $v'$. The probability of this type of
scattering per unit length of the rod at location $x$ is given by $\sigma_s(x, v \to v')$.

Inscatter $\sigma_s(x, v' \to v)$: This is the opposite of outscatter. A particle traveling in
direction $v'$ may undergo a collision and be scattered into $v$. This is called
inscattering, or forward-scattering. The same scattering function is used to
characterize this behavior, and only the directions that parameterize it are
reversed, so $\sigma_s(x, v' \to v)$ describes the probability of inscatter per unit distance.
If $\sigma_s(x, v \to v') = \sigma_s(x, v' \to v)$ for all $x \in [0, a]$, then the material is said to
be isotropic with respect to scattering.

Volumetric emission $\epsilon(x, v)$, $0 \le x \le a$: The material may emit particles into the rod
by virtue of some internal process. The probability of such an emission in
direction $v$ per unit length of the rod is given by $\epsilon(x, v)$.


Because the scattering functions are defined for an abstract direction $v$, a particular
scattering event can only be characterized as inscatter or outscatter if we specify
$v$. For example, suppose a left-moving particle is scattered and leaves the event by
moving to the right. If we are interested in finding the left-moving flux, then this
event would be labeled as outscattering; if we were interested in the right-moving
flux, it would be inscattering.

There are two general approaches to characterizing the different scattering
probabilities. One approach first determines the probability of any scattering event and
then scales this probability by the relative probabilities of each type of scattering. 
The other approach simply writes each scattering probability directly, rather than as 
a fraction of a total scattering probability. We follow the latter approach here. 


12.6.1 Explicit Flux 

As in the previous section, we will now consider a small subrod and quantify the
fluxes inside based on the arriving fluxes and the rod properties. Surface emission
and reflection are boundary conditions, so we will not use them here. They will be
our principal subject in Section 12.10.

We begin with a small subrod on the interval $[x_0, x_0 + \Delta x]$ illustrated in
Figure 12.7. In an internal piece of the rod some number of particles are assumed to
be entering from both ends. These are the particles that are absorbed and scattered.
The particles that are unaffected are said to stream through the volume. Thus the
total number of particles exiting the rod is the sum of three positive terms
(streaming, volumetric emission, and inscattering) and two negative terms (absorption and
outscattering).

For the moment, we will focus explicitly on the expression for the right-moving
flux $\Phi(x + \Delta x, R)$ leaving the right side of the tube. That is, $v = R$, and $v' = L$.
So we can write the right-moving flux as a sum of five fluxes, three positive and two
negative:

$$ \Phi(x + \Delta x, R) = \text{streaming} + \text{emission} + \text{inscattering} - \text{absorption} - \text{outscattering} \tag{12.20} $$

We can now fill in each of the five terms in Equation 12.20. 

Streaming: This accounts for those particles that arrive at the left side and exit from 
the right side of the rod without any interaction with the material. We will 
assume that all such particles pass through and will use the other terms in 
Equation 12.20 to subtract out those that are absorbed or scattered. So the 
flux due to streaming is just the arriving flux $(x, R): 

$$ \Phi_s = \Phi(x, R) \tag{12.21} $$


Emission: The volume emission term is simply the probability of emission per unit 
volume times the subrod’s volume: 

$$ \Phi_e = \epsilon(x, R)\,\Delta x + O(\Delta x) \tag{12.22} $$

Inscattering: The inscattered flux is based on the left-moving flux arriving from the
right side of the rod at $x + \Delta x$ times the probability that these particles will
be inscattered and redirected from moving left into moving right:

$$ \Phi_i = \Phi(x + \Delta x, L)\,\sigma_s(x, L \to R)\,\Delta x + O(\Delta x) \tag{12.23} $$

Absorption: The absorbed flux is proportional to the absorption coefficient and the
distance traveled by the incident particles:

$$ \Phi_a = \Phi(x, R)\,\sigma_a(x, R)\,\Delta x + O(\Delta x) \tag{12.24} $$

Outscattering: The outscattered flux is based on the incident flux times the
probability that each particle will be scattered from moving right to moving left, times
the volume traversed:

$$ \Phi_o = \Phi(x, R)\,\sigma_s(x, R \to L)\,\Delta x + O(\Delta x) \tag{12.25} $$

We can now write out Equation 12.20 in detail: 


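$$\begin{aligned}
\Phi(x + \Delta x, R) &= \Phi_s + \Phi_e + \Phi_i - \Phi_a - \Phi_o \\
  &= \Phi(x, R) + \Delta x \left[\epsilon(x, R) + \Phi(x + \Delta x, L)\,\sigma_s(x, L \to R)\right] \\
  &\quad - \Delta x\,\Phi(x, R) \left[\sigma_a(x, R) + \sigma_s(x, R \to L)\right] + O(\Delta x)
\end{aligned} \tag{12.26}$$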


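We would like to find a solution to Equation 12.26. So, using the same limit
argument that we used in the previous section, we subtract $\Phi(x, R)$ from both sides,
divide through by $\Delta x$, and take the limit as $\Delta x \to 0$:

$$\begin{aligned}
\frac{d\Phi(x, R)}{dx} &= \lim_{\Delta x \to 0} \frac{\Phi(x + \Delta x, R) - \Phi(x, R)}{\Delta x} \\
  &= \epsilon(x, R) + \Phi(x, L)\,\sigma_s(x, L \to R) \\
  &\quad - \Phi(x, R) \left[\sigma_a(x, R) + \sigma_s(x, R \to L)\right]
\end{aligned} \tag{12.27}$$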


We can repeat the whole analysis for the flux arriving at the right end of the 
subrod and exiting from the left end. This results in 

$$ -\frac{d\Phi(x, L)}{dx} = \epsilon(x, L) + \Phi(x, R)\,\sigma_s(x, R \to L) - \Phi(x, L) \left[\sigma_a(x, L) + \sigma_s(x, L \to R)\right] \tag{12.28} $$

Equations 12.27 and 12.28 are too complicated to give us any hope of continuing 
on along this line of thought. The presence of so many functions that are dependent 
on x makes it hopeless to search for a general, analytic solution. 
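
Although a general analytic solution is out of reach, Equations 12.27 and 12.28 can
still be stepped numerically. The sketch below is one possible approach, not a method
from the text: it assumes constant coefficients and, for simplicity, that both fluxes
are known at $x = 0$ (an initial-value setup, whereas the rod problems above specify
one flux at each end). The names `Medium`, `derivs`, and `integrate` are hypothetical,
and the classical fourth-order Runge-Kutta rule is our choice of integrator.

```python
import math
from dataclasses import dataclass


@dataclass
class Medium:
    """Constant material coefficients (an assumption; the text allows x-dependence)."""
    eps_r: float = 0.0   # volumetric emission into direction R
    eps_l: float = 0.0   # volumetric emission into direction L
    abs_r: float = 0.0   # absorption of right-moving particles
    abs_l: float = 0.0   # absorption of left-moving particles
    sc_rl: float = 0.0   # scattering from R into L
    sc_lr: float = 0.0   # scattering from L into R


def derivs(phi_r, phi_l, m):
    """Right-hand sides of Equations 12.27 and 12.28, solved for d(phi)/dx."""
    d_r = m.eps_r + phi_l * m.sc_lr - phi_r * (m.abs_r + m.sc_rl)
    d_l = -(m.eps_l + phi_r * m.sc_rl - phi_l * (m.abs_l + m.sc_lr))
    return d_r, d_l


def integrate(phi_r0, phi_l0, m, length, steps=1000):
    """Step both fluxes from x = 0 to x = length with classical RK4."""
    h = length / steps
    r, l = phi_r0, phi_l0
    for _ in range(steps):
        k1 = derivs(r, l, m)
        k2 = derivs(r + 0.5 * h * k1[0], l + 0.5 * h * k1[1], m)
        k3 = derivs(r + 0.5 * h * k2[0], l + 0.5 * h * k2[1], m)
        k4 = derivs(r + h * k3[0], l + h * k3[1], m)
        r += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        l += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
    return r, l
```

With absorption only, the right-moving flux should decay exponentially, which gives a
simple sanity check on the integrator.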


12.6.2 Implicit Flux

We have just formed explicit expressions for the flux at one location along the rod 
based on the material properties and the flux at another location. We then saw that 
it would be very hard to solve the explicit equations. An alternative approach that 
will be useful in 3D is to find an implicit expression for this function. 

We posit that the particle flow in the tube has reached a steady state. That is, there
are still particles moving back and forth, and emission, collisions, and absorption
are all occurring, but if we look at the flow through any cross section of the rod, we
find that the magnitude of this flow is constant over time. We say that the system is
in equilibrium, which we express by setting the time derivative of the flux to zero:
$d\Phi/dt = 0$.

When the system is in equilibrium, the gains exactly balance the losses. The gains 
in a subrod are due to streaming, inscattering, and volumetric emission; losses come 
from absorption and outscattering. So the equilibrium condition for the rod states 
that 

$$ \Phi_s(x, v) + \Phi_e(x, v) + \Phi_i(x, v) = \Phi_o(x, v) + \Phi_a(x, v) \tag{12.29} $$

Expanding these terms yields an implicit formula for the flux; that is, the actual flux 
in the rod is described by a function that satisfies the equality. We won’t bother to 
expand the terms here since the complicated result will not be very intuitive right 
now. However, we will find that the implicit form is very attractive in 3D, because 
it may be expressed as an integral equation, which leads to efficient and intuitive 
solution algorithms. 


12.7 Particle Transport in 3D 

The rod model of the last section was very simple in many ways. In particular, 
particles could travel in only two directions, the only surfaces that interacted with 
the particles were the rod’s ends, and these ends were perpendicular to the flow. 
These conditions allowed us to write explicit expressions for both the left and right 
flux, and since these two expressions depended only on each other, we could solve 
them together and (sometimes) find analytic results for both $\Phi(x, R)$ and $\Phi(x, L)$.

In the general 3D case, things are still straightforward, but we lose the geometric 
simplicity of the rod. In particular, there are an infinite number of directions in 
which particles can travel, there are a potentially infinite number of surfaces which 
can interact with particles, and these surfaces can be oriented in any direction. 

Although most of the concepts we will cover in 3D were introduced in our 
discussion of the rod model, the 3D setting requires more bookkeeping than the rod. 
This translates into a busier mathematical notation. 

We begin this section with a discussion of the mathematical ideas that we will 
need, mostly to allow us to label and selectively gather sets of directions, points, and 
surfaces. We then use this notation to derive the transport equation in 3D. 


12.7.1 Points

We will refer to a generic 3D point in space with the letter $\mathbf{r}$; a point on a surface
will be denoted $\mathbf{s}$. A particular volume of space will be denoted $V$. A differential
volume in space around $\mathbf{r}$ will generally be referred to as $d\mathbf{r}$. Thus if we have some
scalar function $f(\mathbf{r})$, we can find its integral $F$ in a volume $V$ with the expression

$$ F = \iiint_V f(\mathbf{r})\,d\mathbf{r} \tag{12.30} $$

This type of triple integral is so common that we will usually simplify the notation
by dispensing with the three explicit integral signs, leaving it clear from context that
the integral is over a volume of space. Thus we will more often write

$$ F = \int_V f(\mathbf{r})\,d\mathbf{r} \tag{12.31} $$

The domain $\mathcal{R}^3$ stands for all points in all of space. In a similar vein, when we want
to integrate over a surface, we will write this as a single integral over the surface (say
$S$), representing a double integral over the surface of $S$.

The set of all surfaces of the environment is denoted $M$; each individual surface
is an $M_i$. We will assume that all surfaces are smooth and sufficiently well defined
such that every point $\mathbf{s}$ on a surface $M_i$ has an associated surface normal $\mathbf{n}(\mathbf{s})$ (recall
that all surface normals in this book have unit magnitude, so $|\mathbf{n}(\mathbf{s})| = 1$, $\forall \mathbf{s} \in M$).


12.7.2 Projected Areas

A projected area describes how much of a piece of surface area is visible from a 
particular point of view. Consider a small planar patch of surface $A$ with area $|A|$
and surface normal N, as in Figure 12.8(a). If we project A onto a plane that is 
perpendicular to its normal, the area of that projection is just the area of A itself. 
We will often be interested in the area of A when viewed from some other direction, 
say along a vector V. To find the area of A visible along V, we parallel-project 
A onto a plane perpendicular to V and calculate the area of the projection, as in 
Figure 12.8(b). 

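The projected area of $A$ in direction $\mathbf{V}$, which we write as $A_{\mathbf{V}}$, is defined as

$$ A_{\mathbf{V}} = A\,(\mathbf{N} \cdot \mathbf{V}) = A\cos\theta \tag{12.32} $$

where $\theta$ is the angle between the unit vectors $\mathbf{N}$ and $\mathbf{V}$. To confirm the
notation, observe that $A_{\mathbf{N}} = A$.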
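
Equation 12.32 translates directly into code. This is a small illustrative sketch; the
function name `projected_area` is hypothetical, and we normalize both vectors inside
the function as a convenience that the text does not require.

```python
import math


def projected_area(area, normal, view):
    """Projected area A_V = A (N . V) of a planar patch.

    `normal` and `view` are 3-tuples; both are normalized here so that
    callers need not pass unit vectors.
    """
    def unit(v):
        length = math.sqrt(sum(c * c for c in v))
        return tuple(c / length for c in v)

    n, v = unit(normal), unit(view)
    return area * sum(a * b for a, b in zip(n, v))


# Viewing the patch along its own normal leaves the area unchanged.
head_on = projected_area(2.0, (0.0, 0.0, 1.0), (0.0, 0.0, 5.0))
```

A negative result indicates that the patch faces away from the viewing direction.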

Much of the published material in the radiometric and other literature simply
writes $A_p$ for a projected area (the “p” stands for “projected”), where the reader
is expected to figure out or remember what direction the patch is being projected
into. Other notation includes the $\cos\theta$ term explicitly in all formulas. I prefer the
notation $A_{\mathbf{V}}$ presented here since it explicitly states both the patch and the direction
of projection. We will expand the cosine term when it’s needed for manipulations.

It is sometimes convenient to think of a projected area as a vector quantity with
magnitude and direction corresponding to area and normal, respectively.

FIGURE 12.8

Projected areas. (a) The area projected onto a plane parallel to itself. (b) When projected onto a
nonparallel plane, the projected area diminishes.


12.7.3 Directions 

A bold capital roman letter, such as V, stands for a vector. We often use vectors 
to stand for the flow of some material, since they indicate both the direction and 
magnitude of the flow. 

Often we care only about the direction. One approach is to use vector notation 
with no change. Some authors prefer to place a hat over a vector, such as $\hat{\mathbf{V}}$, to
indicate that a vector has unit length.

A popular alternative, which we adopt, is to use a slightly different notation for
unit-length vectors, often called direction vectors, or simply directions. This notation
will generalize below to the idea of a solid angle.

We will denote a direction by a vector $\vec{\omega}$. By definition, $|\vec{\omega}| = 1$, so we can think
of direction vectors as identifying points on a unit-radius sphere around the origin. 
For consistency with points, such vectors could be written in bold type, but bold 
Greek letters are sometimes difficult to distinguish from regular Greek letters. So we 
will place an arrow above each direction vector to remind us that it is not a scalar. 

Figure 12.9 shows a spherical coordinate system for representing directions, and
a generic direction vector $\vec{\omega}$. Superimposed on this sphere is a set of left-handed
Cartesian coordinates for reference. The angle $\theta \in [0, \pi]$ describes the angle made
by $\vec{\omega}$ with the $z$ axis, and the angle $\psi \in [0, 2\pi]$ describes the angle made by the
projection of $\vec{\omega}$ onto the $xy$ plane with the $x$ axis. Just as a 3D point may be viewed
as a packaging of three scalar components, $\mathbf{r} = (r_x, r_y, r_z)$, so is the direction a
combination of two scalars: $\vec{\omega} = (\omega_\theta, \omega_\psi)$. Often it is convenient to think of direction
vectors in spherical coordinates, and vectors such as $\mathbf{V}$ in rectilinear coordinates,
although of course either form can be expressed in either system.

FIGURE 12.9

A spherical coordinate system for locating directions. The angle $\theta$ specifies the angle made by a
direction $\vec{\omega}$ with the $z$ axis, and $\psi$ specifies the angle made by the projection of $\vec{\omega}$ onto the $xy$ plane
with the $x$ axis.
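
The correspondence between the two scalars $(\theta, \psi)$ and a rectilinear unit vector can
be sketched as follows. We assume the usual convention implied by the angle
definitions above ($\theta$ measured from the $z$ axis, $\psi$ measured from the $x$ axis
toward the $y$ axis in the $xy$ plane). The function names `direction` and `angles` are
our own.

```python
import math


def direction(theta, psi):
    """Unit vector for polar angle theta (from z) and azimuth psi (from x)."""
    s = math.sin(theta)
    return (s * math.cos(psi), s * math.sin(psi), math.cos(theta))


def angles(d):
    """Recover (theta, psi) from a unit vector, with psi in [0, 2*pi)."""
    x, y, z = d
    theta = math.acos(max(-1.0, min(1.0, z)))   # clamp against rounding
    psi = math.atan2(y, x) % (2.0 * math.pi)
    return theta, psi
```

Converting a pair of angles to a vector and back should return the original pair,
except at the poles, where $\psi$ is not uniquely defined.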


12.7.4 Solid Angles 

A solid angle is the 3D analog to the familiar 2D concept of angle. Consider some
2D object viewed from a point. We can draw a circle around that point and then
identify the range of the circle isolated by the radial projection of that object, as in
Figure 12.10(a). A 2D angle $\theta$ may be defined as the ratio of that portion $L$ of the
circle’s circumference to the radius $r$ of the circle: $\theta = L/r$. If the radius of the circle
is 1, then the angle is simply the indicated length of circumference. For example,
one-quarter of any circle of radius $r$ isolates $\theta = (1/4)(2\pi r)/r = \pi/2$ radians.

FIGURE 12.10

(a) A 2D angle is formed by the radial projection of an object onto a circle. (b) A 3D solid angle is
formed by the radial projection of an object onto a sphere.

The solid angle idea generalizes the direction vector $\vec{\omega}$ of the previous section into
a whole range of directions. Thinking of a direction vector as a point on a sphere,
a differential solid angle indicates a differential region of the sphere. We write a
differential solid angle as $d\vec{\omega}$. When the region is of finite size, then we have a finite
solid angle, which in this book is represented by a capital Greek letter, typically A
and $\Gamma$.

The directional quantities $\vec{\omega}$, $d\vec{\omega}$, and $\Gamma$ correspond to the spatial quantities $\mathbf{r}$, $d\mathbf{r}$,
and $V$.

The magnitude of a finite solid angle $\Gamma$ is the ratio of some portion $S$ of the
surface area of a sphere to the squared radius $r$ of the sphere: $\Gamma = S/r^2$. The unit of
solid angle is the steradian (abbreviated sr), and since the surface area of a sphere is
$4\pi r^2$, a full sphere occupies $\Gamma = 4\pi r^2/r^2 = 4\pi$ steradians. The radius of the sphere
used for determining the solid angle is immaterial. To see this, if $a$ is the fraction
of the surface area of the sphere occupied, then $\Gamma = a(4\pi r^2)/r^2 = 4\pi a$ steradians,
so the radius $r$ has dropped out.

It may be helpful to form an intuitive idea of how much of a sphere is subtended
by one steradian. The full sphere contains $4\pi \approx 12.566$ steradians. The dodecahedron
is a Platonic solid with twelve equal faces, each a regular pentagon. So one steradian
is about equal to the solid angle subtended by one face of a dodecahedron, as seen
from its center.
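
As a quick numerical check of these magnitudes, the sketch below evaluates
$\Gamma = S/r^2$ for a full sphere and for one-twelfth of it. The function name
`solid_angle` is a hypothetical helper, and the radius of 3 is an arbitrary choice made
only to show that it drops out.

```python
import math


def solid_angle(area_on_sphere, radius):
    """Solid angle in steradians: Gamma = S / r^2."""
    return area_on_sphere / radius ** 2


# The whole sphere, measured on a sphere of radius 3.
full = solid_angle(4.0 * math.pi * 3.0 ** 2, 3.0)

# One face of a dodecahedron seen from its center covers 1/12 of the sphere.
face = full / 12.0
print(f"full sphere: {full:.3f} sr, dodecahedron face: {face:.3f} sr")
```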

It is often useful to find the solid angle subtended by some object as viewed
from some point. In this case, the term $S$ may be considered the area of the sphere
intersected by a cone with apex at the center of the sphere, and a cross section
formed by the silhouette of the object as seen from the sphere’s center, as shown in
Figure 12.10(b).

FIGURE 12.11

The geometry of a zone.

A piece of sphere isolated by a plane is called a zone. A zone is characterized
by the radius $r$ of the sphere from which it was cut, and its height $h$. We find $h$
by drawing a radius from the center of the original sphere through the center of
the base of the zone; $h$ is the length of this line contained in the zone, as shown in
Figure 12.11. The surface area of a zone is given by $S = 2\pi r h$.

The base of a zone is a circle, which we say has radius $a$, as shown in Figure 12.11.
When the radius of the sphere is much greater than the size of the zone (that is,
$r \gg a$), we can approximate the area of the zone by the area of this circle. The angle
subtended by the zone is labeled in Figure 12.11 as $\alpha$, so $a = r\sin\alpha$. The area of
this disk is then $\pi a^2 = \pi(r\sin\alpha)^2$. When $\alpha$ is small, $\sin\alpha \approx \alpha$, so the area is
about $\pi r^2\alpha^2$ and the solid angle is simply $\pi\alpha^2$.

When the visible surface area of a convex object is small compared to its distance,
we can often approximate its solid angle by using a simpler geometric representation
for the object. A useful simplification for convex objects is suggested by the
observation that a convex body with surface area $S$, projected onto some random direction,
will have an average projected area of $S/4$ [17]. So we can approximate a convex
object by a disk with radius $b = \sqrt{S/4\pi}$, oriented orthogonally to the direction of
view, at the same distance $d$ as the object itself. In Figure 12.12(a), an object viewed
from point $P$ has been replaced by a disk of radius $b$ at point $C$, oriented so that its
normal points directly to $P$.

FIGURE 12.12

Solid angle approximation. (a) Approximating the area of a zone by a disk. (b) The geometry of
the disk.

When this disk is small compared to the distance (that is, $b \ll r$), then we can use
the approximation to the zone area discussed above.

To find the zone, consider Figure 12.12(b). The radius of the sphere is $r = |C - P|$.
From the diagram, $\theta = \arctan(a/r)$, and $h = r - r\cos\theta = r(1 - \cos\theta)$. Thus the
actual magnitude of the solid angle is $\Gamma = 2\pi r[r(1 - \cos\theta)]/r^2 = 2\pi(1 - \cos\theta)$. So,
taking the sphere radius to be the viewing distance ($r = d$) and the base radius to be
the disk radius ($a = b$), we can find the magnitude of the solid angle $\Gamma$ of a convex
object of surface area $S$ at a distance $d$ as

$$ \Gamma = 2\pi\left\{1 - \cos\left[\tan^{-1}(b/d)\right]\right\} = 2\pi\left[1 - d/\sqrt{b^2 + d^2}\right] \tag{12.33} $$

where $b = \sqrt{S/4\pi}$. When $\theta$ is small, a useful approximation for the cosine is
$\cos\theta \approx 1 - \theta^2/2$, so we can simplify the solid angle as $\Gamma \approx 2\pi[1 - (1 - \theta^2/2)] = \pi\theta^2$.
The error of this approximation is within about 1% for $\theta < 20$ degrees.

Just like projected areas, we can find projected solid angles by projecting the solid
angle onto a plane perpendicular to some direction $\mathbf{V}$, as in Figure 12.13. In this
book, we write the projected solid angle for the solid angle $\Gamma$ projected onto direction
$\mathbf{V}$ as $\Gamma_{\mathbf{V}}$, similar to projected area.

FIGURE 12.13

Projected solid angle.
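
The disk approximation to the solid angle of a convex object, and its small-angle form,
can be compared numerically. This sketch assumes a disk of radius $b = \sqrt{S/4\pi}$
viewed face-on from a distance $d$, so that the half-angle is $\theta = \arctan(b/d)$. The
function names are our own, and the sample object (a unit sphere seen from ten radii
away) is only an illustration.

```python
import math


def disk_radius(surface_area):
    """Radius b of the disk whose area equals the mean projected area S/4."""
    return math.sqrt(surface_area / (4.0 * math.pi))


def solid_angle_exact(b, d):
    """Zone solid angle 2*pi*(1 - cos(theta)) with theta = atan(b / d)."""
    return 2.0 * math.pi * (1.0 - d / math.hypot(b, d))


def solid_angle_small(b, d):
    """Small-angle approximation pi * theta**2."""
    theta = math.atan2(b, d)
    return math.pi * theta * theta


# A unit sphere has surface area 4*pi, so its stand-in disk has radius 1.
b = disk_radius(4.0 * math.pi)
exact = solid_angle_exact(b, 10.0)
approx = solid_angle_small(b, 10.0)
print(f"exact = {exact:.6f} sr, small-angle = {approx:.6f} sr")
```

The two values agree closely when the half-angle is small and drift apart as the disk
fills more of the field of view.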

Two interesting properties of the solid angle and its computation are worth noting. 
Figure 12.14 shows that if some object is projected radially onto any surface, then 
the solid angle occupied by that projection is equal to the solid angle of the original 
object. This is useful because it is often more convenient to find the solid angles for 
unusual shapes in two steps, first projecting it onto a simple intermediate surface 
such as a plane, and then projecting the plane to a hemisphere. 

Figure 12.15 shows that the absolute value of the size of the object being projected 
is not the only thing that matters; two different shapes with the same cross section 
as seen from a given point can occupy the same solid angle if they are at appropriate 
distances from the point. 

The notation used for solid angles varies a lot from one field to another, and
sometimes even within the same field. In particular, in radiometry a capital Greek
letter such as $\Omega$ often stands for the projected solid angle $\Gamma\cos\theta = \Gamma_{\mathbf{V}}$, and a
lowercase Greek letter is used to indicate any type of solid angle, from the infinitely
thin direction to finite angles. This variation in notation makes it difficult to recognize
even identical equations when written by authors in different fields. In this book,
we will find it important to distinguish between directions and solid angles with
differential or finite character. Throughout this book we will consistently use
arrow-accented lowercase Greek letters such as $\vec{\omega}$ for direction vectors, the notation $d\vec{\omega}$ for
differential solid angles, and capital Greek letters such as $\Gamma$ for finite solid angles.
Compare this to the use of $\mathbf{r}$ for a point, $d\mathbf{r}$ for a differential volume, and $V$ for a
finite volume.

FIGURE 12.14

An object radially projected onto intermediate surfaces. Redrawn from Cohen and Greenberg in
Computer Graphics (Proc. Siggraph ’85), fig. 4, p. 34.

FIGURE 12.15

Two different objects can occupy the same solid angle if they occupy the same cone.


12.7.5 Integrating over Solid Angles

We will find it important to integrate functions over solid angles. This is
accomplished just by using a finite solid angle as a domain and a differential solid angle in
the integral. To show this notation in action, suppose we have a scalar function of
direction, $f(\vec{\omega})$, and we wish to integrate this over some finite solid angle (or range
of directions) $\Gamma$, as in Figure 12.16. We will write this as

$$ F = \iiint_\Gamma f(\vec{\omega})\,d\vec{\omega} \tag{12.34} $$

FIGURE 12.16

Integrating a function $f(\vec{\omega})$ over a range $\Gamma$.

As with volume integrals, we will usually drop two of the three explicit integral
signs and leave it to the domain $\Gamma$ and differential $d\vec{\omega}$ to reveal that there’s a triple
integration going on, so we will usually see

$$ F = \int_\Gamma f(\vec{\omega})\,d\vec{\omega} \tag{12.35} $$
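
To make the notation concrete, a solid-angle integral can be estimated numerically by
parameterizing each direction with the angles $(\theta, \psi)$ of Figure 12.9, for which the
differential solid angle is $\sin\theta\,d\theta\,d\psi$. The sketch below uses a simple midpoint
rule over a spherical cap centered on the $z$ axis. The function name and the sample
counts are arbitrary choices for illustration, and the cap-shaped domain is an
assumption made to keep the example short.

```python
import math


def integrate_over_directions(f, theta_max=math.pi, n_theta=200, n_psi=400):
    """Midpoint-rule estimate of the integral of f(theta, psi) over the
    cap 0 <= theta <= theta_max, 0 <= psi < 2*pi."""
    d_theta = theta_max / n_theta
    d_psi = 2.0 * math.pi / n_psi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        # Differential solid angle for this ring of samples.
        weight = math.sin(theta) * d_theta * d_psi
        for j in range(n_psi):
            psi = (j + 0.5) * d_psi
            total += f(theta, psi) * weight
    return total


# Integrating the constant 1 measures the solid angle of the domain itself.
sphere = integrate_over_directions(lambda theta, psi: 1.0)
hemisphere = integrate_over_directions(lambda theta, psi: 1.0, theta_max=math.pi / 2.0)
```

Here `sphere` approximates $4\pi$ and `hemisphere` approximates $2\pi$, the magnitudes of
the full sphere and of a hemisphere of directions.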

There are three special domains that will prove particularly important to us. 


12.7.6 Direction Sets

The first important set is the domain of all possible directions. This is just the 3D
sphere, for which we use the topologist’s notation $S^2$ (the 2 in $S^2$ refers to the 2D
surface of the sphere; a circle is $S^1$).

The other two important special cases arise when we think about the light arriving
at a surface point. Consider the normal $\mathbf{N}(\mathbf{s})$ at point $\mathbf{s}$. Any direction $\vec{\omega}$ may be
classified into one of four categories, depending on its relationship to the normal and
the surface, as illustrated in Figure 12.17.

FIGURE 12.17

The directions around a point. (a) Incoming on the front. (b) Outgoing from the front. (c) Incoming
on the back. (d) Outgoing from the back.

The surface normal indicates the positive (or front) side of the surface at $P$; all
points $Q$ in space for which $(Q - P) \cdot \mathbf{N} > 0$ are on the positive (or front) side. All
vectors arriving at $P$ from the front are gathered together into a hemisphere called
$\Omega^i$; this notation is intended to represent the hemisphere above a surface, with the
superscript $i$ representing incoming, as in Figure 12.17(a). The hemisphere of all
directions leaving the point from the front of the surface is written $\Omega^o$, where the $o$
indicates outgoing, as in Figure 12.17(b).

Similarly, we write the incoming directions arriving on the back of the surface
as $\mho^i$, which is meant to represent the hemisphere below the surface, as in
Figure 12.17(c). Finally, the hemisphere of directions departing the back face is written
$\mho^o$, illustrated in Figure 12.17(d). The magnitude of each of these hemispherical
solid angles is $2\pi$.
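
The four-way classification can be sketched in code. Here we assume each direction is
represented by the offset $Q - P$ from the surface point to the far end of the ray,
together with a flag saying whether the light arrives at or leaves $P$. This
representation, the function name `classify`, and the string labels are our own
illustrative choices rather than notation from the text.

```python
def classify(offset, normal, incoming, eps=1e-12):
    """Name the direction set for light travelling between P and P + offset.

    offset   -- vector Q - P from the surface point P to the far end of the ray
    normal   -- surface normal N at P
    incoming -- True if the light arrives at P, False if it leaves P
    """
    side = sum(a * b for a, b in zip(offset, normal))   # (Q - P) . N
    sense = "i" if incoming else "o"
    if side > eps:
        return "Omega^" + sense    # front hemisphere
    if side < -eps:
        return "Mho^" + sense      # back hemisphere
    return "P^" + sense            # in the plane perpendicular to N


up = (0.0, 0.0, 1.0)
print(classify((0.2, 0.1, 1.0), up, incoming=True))
print(classify((0.0, 0.3, -1.0), up, incoming=False))
```

The third return value covers directions lying exactly in the plane perpendicular to
the normal, which belong to neither hemisphere.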

We can combine the hemisphere direction sets in sixteen possible ways, as shown 
in Figure 12.18. Notice that the main diagonal contains identity elements and the 
matrix is symmetrical. That means there are only six unique new combinations. 

Of these six combinations, one represents the set of all incoming directions
($\Omega^i \cup \mho^i$), one the set of all outgoing directions ($\Omega^o \cup \mho^o$), and the others mix
incoming and outgoing hemispheres. To represent each of these pairs of hemispheres, we
use the letter $\Theta$, with superscripts representing the front hemispheres and subscripts
representing the back hemispheres. The six combinations may be defined as

$$\begin{aligned}
\Theta^{io} &= \Omega^i \cup \Omega^o \cup P^i \\
\Theta^i_i &= \Omega^i \cup \mho^i \cup P^i \\
\Theta^i_o &= \Omega^i \cup \mho^o \cup P^i \\
\Theta^o_i &= \Omega^o \cup \mho^i \cup P^o \\
\Theta^o_o &= \Omega^o \cup \mho^o \cup P^o \\
\Theta_{io} &= \mho^i \cup \mho^o \cup P^i
\end{aligned} \tag{12.36}$$

Notice that when both directions are on the same side, we write the combined term
$io$. These terms are summarized in Table 12.1.

Note that if we had defined $\Theta^{io}$ simply as the union of two of the hemispheres, for
example, $\Theta^{io} = \Omega^i \cup \Omega^o$, then it would not contain any of the directions in the plane
perpendicular to the normal. On surfaces such directions may often be ignored,
but in space they are as important as any other direction. Therefore we adopt the
convention in Equation 12.36 that each spherical direction set is augmented with a
plane of directions, either $P^i$ representing the incoming directions, or $P^o$ representing
the outgoing directions. We establish the convention of using the sense of the upper
hemisphere if it is used, or the first subscript on the lower hemisphere.

FIGURE 12.18

The sixteen combinations of hemisphere sets.

              $\Omega^i$      $\Omega^o$      $\mho^i$        $\mho^o$
$\Omega^i$    $\Omega^i$      $\Theta^{io}$   $\Theta^i_i$    $\Theta^i_o$
$\Omega^o$    $\Theta^{io}$   $\Omega^o$      $\Theta^o_i$    $\Theta^o_o$
$\mho^i$      $\Theta^i_i$    $\Theta^o_i$    $\mho^i$        $\Theta_{io}$
$\mho^o$      $\Theta^i_o$    $\Theta^o_o$    $\Theta_{io}$   $\mho^o$

TABLE 12.1

Combining direction hemispheres.


With this convention, $\Theta^i_i$ represents the set of all incoming directions to a point,
and $\Theta^o_o$ represents the set of all outgoing directions from a point.

Since there is no surface for a point in space, any convenient vector may be used 
as the “normal” simply to provide orientation. In most expressions the vector being 
used as the normal will appear explicitly. 

For completeness we can define the four degenerate terms by using a single letter 
in the appropriate position: 


$$\begin{aligned}
\Theta^i &= \Omega^i \\
\Theta^o &= \Omega^o \\
\Theta_i &= \mho^i \\
\Theta_o &= \mho^o
\end{aligned} \tag{12.37}$$


The meaning of the six different combinations of hemispheres may be made 
clearer by Figures 12.19 and 12.20. Here we have reduced each hemisphere to 
a small solid angle. The four combinations show how we can represent the four 
possibilities of where light comes from and where it goes. 

These sets of directions are functions of the point $\mathbf{s}$ because they depend on the
normal there; different points will partition the sphere of directions differently, as in
Figure 12.21. When we need to indicate this dependence, we will write, for example,
$\Omega^i(\mathbf{s})$ for $\Omega^i$ at point $\mathbf{s}$.

It’s important to have a good intuitive feeling for these symbols because they 
will crop up frequently, and we will generalize this terminology to refer to different 
measures of light in Chapter 13. 

To lock down the four interpretations of the direction hemispheres given above, 
consider the interaction of a beam of light from the sun arriving at the Earth, as 
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A grid of the four mixed combinations of hemispheres. Upper left: reflection. Upper right: 
transmission. Lower left: forward scattering. Lower right: backward scattering. 
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The two similar combinations of hemispheres, (a) Incoming, (b) Outgoing. 
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PIGURI 12.21 

The orientations of the direction hemispheres depend on the normal at a surface point. 


shown in Figure 12.22. For illustration we’ll assume that the atmosphere can be
represented by a thin spherical shell around the Earth, and that all normals point
outward from the Earth’s center. The initial ray in direction $\vec{\omega}_1$ arrives from space
and strikes a particle in the atmosphere at point $\mathbf{a}$, so $\vec{\omega}_1 \in \Omega_i(\mathbf{a})$; that is, it’s incident
light arriving from outside the surface. If some of the light is reflected back into
space in direction $\vec{\omega}_2$, then $\vec{\omega}_2 \in \Omega_o(\mathbf{a})$, since it’s departing light leaving the outside
of the surface. Some of the light may continue on to the Earth in a new direction $\vec{\omega}_3$,
so $\vec{\omega}_3 \in \mho_o(\mathbf{a})$; that is, it’s departing light leaving from inside the surface.

Now suppose the light strikes the ground at point $\mathbf{g}$ and is reflected into direction
$\vec{\omega}_4$. From the point of view of $\mathbf{g}$, the incident direction is $\vec{\omega}_3 \in \Omega_i(\mathbf{g})$ and the reflected
direction is $\vec{\omega}_4 \in \Omega_o(\mathbf{g})$.

Finally the light strikes another particle in the atmosphere and is deflected before
continuing on to space in direction $\vec{\omega}_5$. At this intersection point $\mathbf{b}$, the incident
direction is $\vec{\omega}_4 \in \mho_i(\mathbf{b})$ and the deflected direction is $\vec{\omega}_5 \in \Omega_o(\mathbf{b})$.
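
The bookkeeping in this example can be checked mechanically. The following is a minimal sketch, not part of the original presentation: the function name `classify_direction`, the string labels, and the flat toy geometry (one shared normal for a, g, and b) are all assumptions made for illustration. It also assumes that a ray belongs to the upper hemisphere whenever the light is on the normal's side of the tangent plane.

```python
import numpy as np

def classify_direction(travel, normal, arriving):
    """Name the hemisphere set that holds a ray at a surface point.

    travel   -- direction in which the light is moving (need not be unit length)
    normal   -- surface normal at the point
    arriving -- True for incident light, False for departing light
    """
    travel = np.asarray(travel, dtype=float)
    normal = np.asarray(normal, dtype=float)
    # Assumption: a ray is in the upper hemisphere when the light is on the
    # normal's side of the tangent plane. Arriving light sits on the side it
    # came from, which is opposite to its direction of travel.
    side = -travel if arriving else travel
    hemisphere = "Omega" if np.dot(side, normal) > 0.0 else "Mho"
    return hemisphere + ("_i" if arriving else "_o")

up = np.array([0.0, 0.0, 1.0])  # outward normal shared by a, g and b in this toy setup

print(classify_direction([0.0, 0.0, -1.0], up, arriving=True))   # omega_1 at a
print(classify_direction([0.3, 0.0, 1.0], up, arriving=False))   # omega_2 at a
print(classify_direction([0.2, 0.0, -1.0], up, arriving=False))  # omega_3 at a
print(classify_direction([0.0, 0.0, 1.0], up, arriving=True))    # omega_4 at b
```

The four calls follow the sunlight through the atmosphere in the order described above; `Mho` stands in for the lower hemisphere.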
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FI G U It I 12.22 

A ray of light from the sun arriving at the Earth. 


12.7.7 Particles 

We now turn to characterizing the particles themselves. We will continue to assume 
particles that have the same properties that they had in the rod model. In 3D, these 
become simply the following two: 

1 All particles move with the same speed c. 

2 Particles do not interact. Thus, two particles may pass through each other. 

Any particle satisfying these conditions can be completely described by a pair of
vectors $(\mathbf{r}, \vec{\omega})$ giving its position and direction of motion (since the speed is always
the same). This pair of vectors contains five real scalars: $(\mathbf{r}, \vec{\omega}) = (r_x, r_y, r_z, \omega_\theta, \omega_\psi)$.
We may be prompted to think of a five-dimensional Euclidean space $\mathcal{R}^5$, in which
each particle is just a point. A better picture is a Cartesian product space of points
and directions $\mathcal{R}^3 \otimes S^2$. This space is called particle phase space, or simply the phase
space for particles.

We define the scalar function $n(\mathbf{r}, \vec{\omega}): \mathcal{R}^3 \otimes S^2 \to \mathcal{R}$ to be the number of particles
at the point $(\mathbf{r}, \vec{\omega})$ in phase space (two particles at the same point in phase space are
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MOURI 12.23 

Counting the number of particles in the region $V \otimes \Gamma$ of phase space.


in the same place, moving in the same direction): 

$n(\mathbf{r}, \vec{\omega})$ = number of particles at $(\mathbf{r}, \vec{\omega})$ (12.38)

This function is called the phase space density function . 

We can isolate pieces of phase space by combining a volume $V$ and a range of
directions $\Gamma$, and forming their Cartesian (or direct) product $V \otimes \Gamma$, as shown in
Figure 12.23. We can use the phase space density function to find the number of
particles in this section of phase space; that is, the number of particles located at any
point $\mathbf{r} \in V$ and traveling in any direction $\vec{\omega} \in \Gamma$:

$N(V, \Gamma) = \int_\Gamma \int_V n(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$ (12.39)

The function $N(V, \Gamma)$ is called the particle density in the region $V \otimes \Gamma$.
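
To make Equation 12.39 concrete, here is a rough numerical sketch. It assumes a box-shaped volume and a cone of directions about the +z axis, and it uses a simple midpoint rule; the function name `count_particles` and its parameters are hypothetical and chosen only for this illustration.

```python
import itertools
import math

def count_particles(n, box_min, box_max, cone_half_angle, res=8):
    """Midpoint-rule estimate of N(V, Gamma) from Equation 12.39.

    V is an axis-aligned box; Gamma is a cone of directions about +z.
    n(r, w) is the phase space density for a position r and unit direction w.
    """
    # Cell sizes and cell midpoints along each spatial axis.
    steps = [(hi - lo) / res for lo, hi in zip(box_min, box_max)]
    axes = [[lo + (k + 0.5) * h for k in range(res)]
            for lo, h in zip(box_min, steps)]
    d_volume = steps[0] * steps[1] * steps[2]
    # Cell sizes in polar angle theta and azimuth psi.
    d_theta = cone_half_angle / res
    d_psi = 2.0 * math.pi / res
    total = 0.0
    for i in range(res):
        theta = (i + 0.5) * d_theta
        # Solid-angle element on the unit sphere: sin(theta) dtheta dpsi.
        d_omega = math.sin(theta) * d_theta * d_psi
        for j in range(res):
            psi = (j + 0.5) * d_psi
            w = (math.sin(theta) * math.cos(psi),
                 math.sin(theta) * math.sin(psi),
                 math.cos(theta))
            for r in itertools.product(*axes):
                total += n(r, w) * d_volume * d_omega
    return total

# Uniform density of 2, unit cube, hemisphere of directions:
# the estimate should be close to 2 * 1 * 2*pi.
estimate = count_particles(lambda r, w: 2.0, (0, 0, 0), (1, 1, 1), math.pi / 2)
print(estimate, 4.0 * math.pi)
```

For a uniform density the integral is just the density times the volume times the solid angle, which gives a quick sanity check on the quadrature.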


12.7.8 Flux 

It will be useful to us to define the flux in 3D; it is based on the same idea as in the 
rod model but contains an additional bit of geometry. 
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MOURI 12.24 

(a) Particles flowing over a surface element AS. (b) The end cap of the tube is tilted relative to the 
flow. 


Suppose that we have isolated some piece of surface $\Delta S$ in space, and that there
is a flow of particles through it, such that the direction of flow is perpendicular to the
surface, as in Figure 12.24(a). The particles all have the same vector velocity $\mathbf{c}$, and
a density $\rho$ (that is, there are $\rho$ particles per unit volume). We want to know the total
number of particles flowing over the surface per unit time.

Suppose initially that the velocity is perpendicular to the surface $\Delta S$. Then in
the time $\Delta t$, the particles that cross $\Delta S$ arrive from within a right tube with length
$|\mathbf{c}|\Delta t$ and volume $|\mathbf{c}|\Delta t \Delta S$. The density of particles in the tube is given by $\rho$, so there are
$\rho|\mathbf{c}|\Delta t \Delta S$ particles in the tube. The rate of flow per time is found by simply dividing by time,
giving $\Phi$, the magnitude of the flux:


$\Phi = \rho|\mathbf{c}|\Delta S$ (12.40)

Suppose now that the surface is tilted with respect to the flow, so the tube
containing the particles is skewed, as in Figure 12.24(b). If the patch has a unit-length
surface normal $\mathbf{n}$, then the projected area of the patch is $(\hat{\mathbf{c}} \cdot \mathbf{n})\Delta S$, where
$\hat{\mathbf{c}} = \mathbf{c}/|\mathbf{c}|$, so the volume of this tube is $(\mathbf{c} \cdot \mathbf{n})\Delta t \Delta S$. As the flow direction $\mathbf{c}$ strikes the surface less
and less head-on, the dot product reduces the size of the tube, until when the flow is
parallel to the surface (and thus nothing flows through the surface), the tube
volume goes to zero. Again, we can find the magnitude of the flux by multiplying by
the particle density and dividing through by $\Delta t$:


$\Phi = \rho(\mathbf{c} \cdot \mathbf{n})\Delta S$ (12.41)
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The cosine term in the flux compensates for the enlarged area of a patch as it turns away from the 
flow within a fixed tube. (a) $\theta = 0$. (b) $\theta = \theta_0$. (c) $\theta = 4\theta_0$.


It is often useful to treat the flux as a vector quantity:


$\mathbf{\Phi} = \rho(\mathbf{c} \cdot \mathbf{n})\mathbf{c}\,\Delta S$ (12.42)


An alternative, more geometrical picture of the origin of the cosine term is given
in Figure 12.25. The flux in this figure is flowing through a square tube of side $s$.
In Figure 12.25(a) we see the end of the tube is perpendicular to the flow; that is,
$\mathbf{c} \cdot \mathbf{n} = 1$, taking $\mathbf{c}$ here as a unit vector along the flow. The area of the end of the tube is $s^2$, so if there are $n$ particles passing
down the tube, the flow per unit area is $\Phi_0 = n/s^2$.

In Figure 12.25(b) the end of the tube now forms an angle of $\theta_0$ with the flow;
that is, $\mathbf{c} \cdot \mathbf{n} = \cos(\theta_0)$. The area of the surface through which the particles are
flowing is now $sl$, where $s = l\cos(\theta_0)$. So the total flow passing through this tube
per unit area is $\Phi_{\theta_0} = n/(sl) = (n/s^2)\cos(\theta_0) = \Phi_0\cos(\theta_0)$.

Figure 12.25(c) shows a more extreme example: now the flux is $\Phi_{4\theta_0} = n/(sl) =
(n/s^2)\cos(4\theta_0) = \Phi_0\cos(4\theta_0)$. So the cosine term (usually represented by a dot
product of the flux direction and the normal) is a purely geometric term which we
introduce so that we are always talking about flow per unit area. When we discuss
flux over a nonplanar surface, we are effectively projecting the entire surface onto a
plane perpendicular to the flow, as in Figure 12.26.
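
The cosine falloff is easy to see numerically. The sketch below evaluates Equation 12.41 for a patch tilted by several angles; the function name `flux_magnitude` and the sample values for density, speed, and area are assumptions chosen only for illustration.

```python
import numpy as np

def flux_magnitude(rho, velocity, normal, area):
    """Particles per unit time crossing a flat patch (Equation 12.41)."""
    velocity = np.asarray(velocity, dtype=float)
    normal = np.asarray(normal, dtype=float)
    normal = normal / np.linalg.norm(normal)  # the formula expects a unit normal
    return rho * float(np.dot(velocity, normal)) * area

rho, speed, area = 5.0, 2.0, 0.1
velocity = np.array([0.0, 0.0, speed])
for degrees in (0.0, 30.0, 60.0, 90.0):
    t = np.radians(degrees)
    normal = np.array([np.sin(t), 0.0, np.cos(t)])  # patch tilted by t about the y axis
    print(degrees, flux_magnitude(rho, velocity, normal, area))
```

At 0 degrees the result matches Equation 12.40, and it shrinks with the cosine of the tilt until nothing crosses the patch at 90 degrees.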

We have assumed that the particles are arriving at AS along a single direction c. 
We can easily allow the particles to arrive within a finite solid angle $\Gamma$. Recalling
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PIOURI 12.26 

The flux falling on a surface from two different directions. 


that all particles have a common speed c, we can write 

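$\Phi(\Delta S, \Gamma) = \int_\Gamma \rho c \, (\vec{\omega} \cdot \mathbf{n}) \, \Delta S \, d\vec{\omega} = \rho c \, \Delta S \int_\Gamma (\vec{\omega} \cdot \mathbf{n}) \, d\vec{\omega}$ (12.43)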

The flux has several useful linearity properties, which may be derived either
directly from the definition or from experiment [347]. Specifically, the flux is linear
with respect to both the size of the area $\Delta S$ and the solid angle of the incident flow,
$\Gamma$, involved in the measurement.

Figure 12.27(a) shows three solid angles, $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$. When we run an
experiment where the sizes of the angles and surfaces are large with respect to the
wavelength of light, we find

$\Phi(A, \Gamma_1) + \Phi(A, \Gamma_2) = \Phi(A, \Gamma_1 + \Gamma_2)$ (12.44)

as shown in Figures 12.27(b) and (c). Similarly, if the solid angle shrinks to zero size,
so does the flux: $\Phi(A, 0) = 0$.

We can state a similar set of principles with respect to the area under consideration.
Figure 12.28 shows two different areas, $A_1$ and $A_2$. We find from experiment
that


$\Phi(A_1, \Gamma) + \Phi(A_2, \Gamma) = \Phi(A_1 + A_2, \Gamma)$ (12.45)
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(b) (c) 
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PIOURI 12.27 

(a) Solid angles $\Gamma_1$, $\Gamma_2$, and $\Gamma_3$. (b) Solid angle $\Gamma_1 + \Gamma_2$. (c) Solid angle $\Gamma_1 + \Gamma_3$.



FIGURE 12.28

(a) Surfaces $A_1$ and $A_2$. (b) The surface $A_1 + A_2$.


and, as before, $\Phi(0, \Gamma) = 0$.

Although the flux is derived from the particle density and particle motion, our 
point of view will generally treat the flux as the fundamental description of the 
particle flow. Most of our discussion will be based on terms that are defined with 
respect to the flux. 
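
As a small check on the linearity in solid angle, the sketch below evaluates the integral of Equation 12.43 over wedges of the hemisphere above a patch. The wedge parameterization, the function name `flux_over_wedge`, and the sample values are assumptions made for this example; the density is taken to be the same for every direction, as in Equation 12.43.

```python
import math

def flux_over_wedge(rho, c, area, psi_start, psi_end, res=200):
    """Equation 12.43 for a wedge of the hemisphere about the normal.

    The wedge spans azimuth psi_start..psi_end and polar angle 0..pi/2, with
    the surface normal along the polar axis, so (w . n) = cos(theta).
    """
    d_theta = (math.pi / 2.0) / res
    polar = 0.0
    for i in range(res):
        theta = (i + 0.5) * d_theta
        # cos(theta) is the projection term; sin(theta) comes from d_omega.
        polar += math.cos(theta) * math.sin(theta) * d_theta
    return rho * c * area * polar * (psi_end - psi_start)

rho, c, area = 3.0, 1.0, 0.5
phi_1 = flux_over_wedge(rho, c, area, 0.0, 1.0)    # solid angle Gamma_1
phi_2 = flux_over_wedge(rho, c, area, 1.0, 2.5)    # adjacent solid angle Gamma_2
phi_12 = flux_over_wedge(rho, c, area, 0.0, 2.5)   # the combined solid angle
print(phi_1 + phi_2, phi_12)
print(flux_over_wedge(rho, c, area, 0.0, 0.0))     # an empty solid angle
```

The two printed sums agree, mirroring Equation 12.44, and the empty wedge carries no flux.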
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A scattering volume. 


12.8 Scattering in 3D 

We can characterize scattering in 3D by what happens to the particles as they pass
through a volume, much as we did for the rod model. Preisendorfer [347] carefully
describes a set of experiments that can be performed to discover some of the
characteristics of volumetric scattering. We summarize those results here.

Figure 12.29 shows a scattering volume, denoted $X$. A scattering volume is
defined by two directions, $\vec{\omega}_i$ and $\vec{\omega}_r$, and their associated finite solid angles, $\Gamma_i$ and
$\Gamma_r$. Of the twelve edges that make up the volume, four are parallel to $\vec{\omega}_i$, four are
parallel to $\vec{\omega}_r$, and the remaining four are perpendicular to the plane spanned by $\vec{\omega}_i$
and $\vec{\omega}_r$. When incident light arriving in direction $\vec{\omega}_i$ enters the volume, we say it does
so through face $A$; the light leaving in direction $\vec{\omega}_r$ exits through face $B$. Face $A$ (and
its opposite face) is parallel to $\vec{\omega}_r$; face $B$ (and its opposite face) is parallel to $\vec{\omega}_i$.
In general, face $A$ will not be perpendicular to $\vec{\omega}_i$. As in the figure, we can construct
a face $A'$ which shares one edge with $A$ but is perpendicular to $\vec{\omega}_i$; similarly, we can
construct a face $B'$ perpendicular to $\vec{\omega}_r$.

We will now summarize the results of three experiments with physical volumes of
this type. We are interested in the magnitude of the flux $\Phi$ leaving face $B$ for various
combinations of incident flux $\Phi$ arriving on face $A$. We will assume that the volume
is large with respect to the wavelength of light, and that it is internally uniform (that
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FIGURE 12.30 

Splitting the input face into two subfaces. 


is, the properties of any small region of the volume are identical to those of any 
other small region). We will also assume that the volume doesn’t generate any flux 
internally. 

In the first experiment we divide the incident face $A$ into two subfaces, $A_1$ and
$A_2$. This partitions the input flux into two pieces, $\Phi_{A_1}$ and $\Phi_{A_2}$, which induce two
corresponding output fluxes, $\Phi_{B_1}$ and $\Phi_{B_2}$. This is illustrated in Figure 12.30. We
find from experiment that

$\Phi_{B_1} + \Phi_{B_2} = \Phi_B$ (12.46)

In words, this says that the output flux is linear with respect to the area of the input 
flux. 

In the second experiment we divide the input solid angle $\Gamma$ into two pieces, $\Gamma_1$
and $\Gamma_2$, as in Figure 12.31. Again, each of these solid angles carries its own input
flux, which generates a corresponding output flux. From experiment, we find

$\Phi_{\Gamma_1} + \Phi_{\Gamma_2} = \Phi_\Gamma$ (12.47)

so that the output flux is also linear with respect to the input solid angle. 

Finally, we can consider two entirely different scattering volumes, and write the 
flux that results from two separate input beams. We find once again that the resulting 
flux is linear with respect to the inputs. 
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PIOURI 12.31 

Splitting the input solid angle into two subsolid angles. 


These results can be viewed either as experimental confirmation of the theory, or 
as the phenomenological basis from which the theory is derived. In either case, most 
of modern computer graphics is based on these fundamental results. 

The essence of these properties is that over most light intensities that we deal with 
in practice, most materials are linear : each packet of light is treated independently 
of all other packets of light. So if we double the amount of light projected into a 
volume, we will double the amount that comes out in any given direction. 
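
One way to picture this linearity is to treat the scattering volume as a fixed table of numbers that maps input flux to output flux. The sketch below is purely illustrative: the `response` matrix, its values, and the function name `scatter` are hypothetical and do not come from any measurement.

```python
import numpy as np

# Hypothetical scattering response: entry [j, k] is the fraction of flux
# entering along input direction k that leaves along output direction j.
response = np.array([[0.10, 0.02],
                     [0.05, 0.20],
                     [0.01, 0.03]])

def scatter(flux_in):
    """Output flux per direction for a linear scattering volume."""
    return response @ np.asarray(flux_in, dtype=float)

beam_a = np.array([4.0, 0.0])
beam_b = np.array([0.0, 2.5])

# Two beams sent in together give the sum of their separate outputs.
print(np.allclose(scatter(beam_a + beam_b), scatter(beam_a) + scatter(beam_b)))
# Doubling the input doubles the output in every direction.
print(np.allclose(scatter(2.0 * beam_a), 2.0 * scatter(beam_a)))
```

Any material whose response can be written this way treats each packet of light independently of the others, which is the property the experiments establish.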


12.9 Components of 3D Transport

The next few sections follow the presentations by Arvo [15] and Pomraning [342].
Recall our five categories of transport in the rod model from Section 12.6. These 
are injection (which we will generalize into the term streaming ), volume emission, 
and the three types of collisions: inscattering, absorption, and outscattering. These 
five categories can also be used to characterize particle transport in 3D, although 
their mathematical expression is more complicated. We will consider each of these 
categories in turn below. 

First, we note two important generalizations that will take us from the rod model 
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MOURI 12.32 

(a) Streaming, (b) Emission, (c) Absorption. 


to 3D. When analyzing the rod model, we isolated a section of the rod and studied
the flow into and out of that section. In 3D, we instead use a volume $V$, which may
have any arbitrary shape. The surface of $V$ will be denoted $S$. In the rod model we
were only concerned with two possible directions in which particles could flow (left
and right). In 3D, there are an infinite number of directions in which particles can
flow. We will typically be concerned with a range of directions, denoted $\Gamma$.

In this section we will also concentrate more on flux than on particle densities. 
This is because we will ultimately be interested in setting the net change of flux within 
any volume to zero, which is the equilibrium condition of energy, which we assume 
holds when synthesizing images. We will discuss the equilibrium condition in more 
detail shortly. 

In general, our material functions will depend on both position $\mathbf{r} \in V$ and
direction $\vec{\omega} \in \Gamma$, so they vary for every point and every direction in the subspace
$V \otimes \Gamma$. We will need to integrate over both domains to find the net results of these
functions.


12.9.1 Streaming

We will use the category of streaming to describe the net flow of particles through a
volume $V$. We begin by finding the net flow of particles through the surface $S$ of $V$.
Because particles can change direction as they move through the volume, we need to
account for their changing directions, as in Figure 12.32(a).
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We can find this flow easily. Recall the definition of flux from Equation 12.42, 
which tells us how to find the flux over any small patch $\Delta S$. We can replace the
surface $S$ with a polyhedron with many small faces $\Delta S_i$, and then sum together
the contributions from each face. The limiting result of this process is the surface
integral [378]:

$\Phi_s = \int_S \Phi(dS) \, dS$ (12.48)

To count all the particles passing through the surface, we need to integrate Equation
12.48 over all the directions $\Gamma$ in which we are interested, giving us the total
flux $\Phi_s$ due to streaming:


$\Phi_s = \int_\Gamma \int_S \Phi(\mathbf{s}, \vec{\omega}) \, dS \, d\vec{\omega}$ (12.49)

This quantity will have a positive value when there is a net loss of particles
from the volume. Note that $\Phi_s$ does not measure the total number of particles
passing through the volume, but instead cancels those exiting with those arriving, leaving us
with a net flow.


12.9.2 Emission 

To count up the particles emitted inside $V$, we will posit a volumetric emission flux
$\epsilon(\mathbf{r}, \vec{\omega})$. This tells us how many particles are emitted per unit time from point $\mathbf{r}$ in
the direction $\vec{\omega}$, as shown in Figure 12.32(b).

To find the total flux radiated from the volume in some set of directions due
to emission, we can simply integrate this function over all points $\mathbf{r} \in V$ and all
directions $\vec{\omega} \in \Gamma$:

$\Phi_e = \int_\Gamma \int_V \epsilon(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$ (12.50)


12.9.3 Absorption 

Similarly to emission, we will assume an absorption function $\sigma_a(\mathbf{r}, \vec{\omega})$. This is a
scalar value, telling us the expected percentage of the flux that will be absorbed as it
passes through $\mathbf{r}$ in direction $\vec{\omega}$, as illustrated in Figure 12.32(c).

To find the total absorption in the volume, we simply integrate this percentage
weighted by the flux at that point. This gives us an absorbed flux, which is the flux
that is removed by the material:

$\Phi_a = \int_\Gamma \int_V \sigma_a(\mathbf{r}, \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$ (12.51)
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MOURI 12.33 

Outscattering of particles from the beam. (a) Outscatter where the outgoing direction $\vec{\omega}' \notin \Gamma$.
(b) Outscatter where the outgoing direction $\vec{\omega}' \in \Gamma$.


12.9.4 Outscattering 

Some particles in the stream may interact with the material, but rather than be
absorbed, they are deflected and continue on in some new direction. In our case, we
are interested in determining the number of particles at a point $\mathbf{r}$ that are deflected
from their motion along direction $\vec{\omega}$ into some other direction $\vec{\omega}'$, as shown in
Figure 12.33.

As with absorption, we can express this probability with a scalar value that
indicates how much of the flux we expect to be scattered this way. The volume
outscattering probability function $\kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}')$ specifies the probability that a particle
at $\mathbf{r}$ traveling in direction $\vec{\omega}$ will be scattered into any direction $\vec{\omega}'$ per unit
distance per unit solid angle. Note that $\vec{\omega}'$ can take on the value of $\vec{\omega}$. The arrow in
this notation is useful because it shows the direction of scattering explicitly.

There are at least two ways to specify how a material outscatters: we can measure
how many particles that are originally traveling in a direction $\vec{\omega} \in \Gamma$ exit in some
other direction $\vec{\omega}' \notin \Gamma$, or we can simply compute the number of particles originally
in $\Gamma$ that are scattered in any direction at all. We will select the latter approach. This
means that we will include in our count of scattered particles those that are traveling
in directions within $\Gamma$ both before and after scattering, as in Figure 12.33(b). We
will find that this method of accounting works well when combined with our means
of counting inscattered particles. In this figure the dashed cone indicates the same
directions except with the arrowheads at the scattering event.

With these conditions, the outscattered flux $\Phi_o$ is found by integrating over all
incident directions $\vec{\omega} \in \Gamma$ at all points $\mathbf{r} \in V$, and by counting the number of particles
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MOURI 12.34 

Inscattering deflects a particle into the range $\Gamma$. (a) Inscatter where the incoming direction $\vec{\omega}' \notin \Gamma$.
(b) Inscatter where the incoming direction $\vec{\omega}' \in \Gamma$.


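deflected into any new direction $\vec{\omega}' \in S^2$:

$\Phi_o = \int_\Gamma \int_V \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, \Phi(\mathbf{r}, \vec{\omega}) \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$ (12.52)

Note that the incident flux $\Phi(\mathbf{r}, \vec{\omega})$ is independent of $\vec{\omega}'$ in the innermost integral.
We can then pull it out of that integral and write

$\Phi_o = \int_\Gamma \int_V \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$ (12.53)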


12.9.5 Inscattering

The category of inscattering is closely related to outscattering: it describes when a
particle originally arriving at a point $\mathbf{r}$ along any direction $\vec{\omega}'$ is deflected into a new
direction $\vec{\omega} \in \Gamma$, as in Figure 12.34. As with outscattering, we describe the probability
of a scattering event occurring with the volume inscattering probability function.
This is also denoted $\kappa$, but the direction of scattering is reversed: $\kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega})$.

Note that if $\vec{\omega}' \in \Gamma$, then we are counting those particles that are both arriving
and departing along a direction in $\Gamma$, just as for outscattering.

So the inscattered flux $\Phi_i$ is found by integrating over all incident directions
$\vec{\omega}' \in S^2$ at all points $\mathbf{r} \in V$, and by counting the number of particles deflected into
any direction $\vec{\omega} \in \Gamma$:

$\Phi_i = \int_\Gamma \int_V \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$ (12.54)
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Notice that we use the flux coming from uj' rather than uj as in the outscattering case. 


12.9.6 A Complete Transport Model

Now that we have an expression for each of the phenomena we want to represent, 
we are almost ready to combine them into a transport model. What’s left are the 
mechanics of this combination. 

The essential observation needed to carry out this combination is that the typical 
image-synthesis problem addresses a system in equilibrium . That is, we assume that 
the distribution of light in the environment is steady and constant (at least over the 
time of exposure of the image). This equilibrium usually comes about very quickly 
because of the great speed of light with respect to the simulated interval over which 
we estimate the distribution of light in an environment. For example, when we 
turn on a flashlight in a dark room, the energy from the flashlight first strikes some 
surfaces, which then reradiate some of the light to other surfaces, and these may in 
turn reradiate some energy back to the first surfaces. This chain of illumination and 
reradiation may go very deep, but eventually it settles down into a steady solution 
where the light energy distribution over time in the room becomes constant. This is 
obvious to us when we hold the light steady and look around the room: the objects 
do not periodically grow brighter and dimmer as time goes on. The illumination in 
the room is then said to be in a steady state, or equilibrium. 
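
The settling-down behavior can be imitated with a toy calculation. The sketch below is only an illustration under strong assumptions: the room is reduced to a handful of patches, and the names `settle`, `emitted`, `reflectance`, and `coupling`, along with the sample numbers, are hypothetical. It repeats the bounce-and-reradiate step until the light leaving each patch stops changing.

```python
def settle(emitted, reflectance, coupling, tolerance=1e-12, max_bounces=1000):
    """Iterate light exchange between patches until it stops changing.

    emitted[i]     -- light patch i emits on its own
    reflectance[i] -- fraction of arriving light that patch i reradiates
    coupling[i][j] -- fraction of light leaving patch j that arrives at patch i
    """
    count = len(emitted)
    leaving = list(emitted)
    for bounce in range(max_bounces):
        updated = []
        for i in range(count):
            arriving = sum(coupling[i][j] * leaving[j] for j in range(count))
            updated.append(emitted[i] + reflectance[i] * arriving)
        change = max(abs(u - v) for u, v in zip(updated, leaving))
        leaving = updated
        if change < tolerance:
            return leaving, bounce + 1
    return leaving, max_bounces

# A flashlight patch (index 0) facing a wall patch (index 1).
steady, bounces = settle(emitted=[1.0, 0.0],
                         reflectance=[0.5, 0.5],
                         coupling=[[0.0, 0.4], [0.4, 0.0]])
print(steady, bounces)
```

Each round of reradiation contributes less than the one before, so the values converge after a modest number of bounces; that limit plays the role of the steady state described above.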

This equilibrium condition implies that the flux at every point and in
every direction in the scene is constant. That is, the time derivative of the flux is
zero. In symbols,

$\frac{\partial \Phi}{\partial t} = 0$ (12.55)

This does not mean that there is no flow of energy in the scene. Rather, this 
equilibrium condition says that there is an active or dynamic equilibrium, in which 
the flow is constant. We can think of this condition as telling us that the number of 
particles flowing past any piece of surface (imaginary or real) in the scene per unit 
time is always constant. 

Because this condition holds everywhere in the scene, it also holds over every
volume $V$ and every set of directions $\Gamma$. Therefore we know that the total gains and
losses in our arbitrary volume must total to 0. In words, the
sum of the two gains ($\Phi_e$ due to emission and $\Phi_i$ due to inscattering) must equal the
sum of the three losses ($\Phi_s$ due to streaming, $\Phi_o$ due to outscattering, and $\Phi_a$ due to
absorption).

In symbols,

$\Phi_e + \Phi_i = \Phi_s + \Phi_o + \Phi_a$ (12.56)

Recall that both $\Phi_o$ and $\Phi_i$ contain the particles both arriving and departing in
directions within $\Gamma$. Because they are both positive and on opposite sides of Equation
12.56, these common terms cancel out, leaving us with the net difference, so we
truly get the net inscatter and outscatter distributions.
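
The cancellation of the shared particles can be verified with a discretized stand-in for the integrals. In the sketch below the sphere of directions is chopped into a few bins at a single point; the bin count, the random values, and the names `kappa`, `flux`, and `gamma` are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
bins = 6                                    # direction bins (an assumption)
kappa = rng.random((bins, bins)) * 0.1      # kappa[a, b]: scattering from bin a into bin b
flux = rng.random(bins)                     # flux travelling along each direction bin
gamma = np.array([True, True, False, False, True, False])  # bins that make up Gamma

def outscattered(kappa, flux, gamma):
    # Flux starting inside Gamma and scattered into any bin at all.
    return float(np.sum(flux[gamma] * kappa[gamma, :].sum(axis=1)))

def inscattered(kappa, flux, gamma):
    # Flux from any bin that is scattered into a bin inside Gamma.
    return float(np.sum(flux[:, None] * kappa[:, gamma]))

# Particles inside Gamma both before and after scattering: counted in both terms.
shared = float(np.sum(flux[gamma][:, None] * kappa[np.ix_(gamma, gamma)]))
# Particles that truly enter Gamma from outside, or truly leave it.
from_outside = float(np.sum(flux[~gamma][:, None] * kappa[np.ix_(~gamma, gamma)]))
to_outside = float(np.sum(flux[gamma][:, None] * kappa[np.ix_(gamma, ~gamma)]))

net = inscattered(kappa, flux, gamma) - outscattered(kappa, flux, gamma)
print(net, from_outside - to_outside, shared)
```

The shared contribution appears in both the inscattered and outscattered totals, so the difference between them equals the true gain from outside minus the true loss to outside.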

We can now simply plug in our results from above to flesh out this expression: 


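$\int_\Gamma \int_V \epsilon(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$
$+ \int_\Gamma \int_V \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$
$= \int_\Gamma \int_S \Phi(\mathbf{s}, \vec{\omega}) \, dS \, d\vec{\omega}$
$+ \int_\Gamma \int_V \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$
$+ \int_\Gamma \int_V \sigma_a(\mathbf{r}, \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$ (12.57)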

This equation is much too hard to attempt to solve directly. But notice that 
four out of five terms in Equation 12.57 have an outermost double integral over the 
volume V and directions T; only the streaming term has a different form. We will 
find that getting everything into the same form will greatly simplify this expression. 

Happily, it is easy to convert the streaming term to the same form as the others.
Recall the divergence theorem (also known as Gauss’s theorem), which states that
for a vector function $\mathbf{F}$ and a volume $V$ with surface $S$ and unit surface normal $\mathbf{n}$
at every point,

$\int_S \mathbf{F} \cdot \mathbf{n} \, dS = \int_V \nabla \cdot \mathbf{F} \, dV$ (12.58)

where $\nabla$ is the del operator, which in rectilinear Euclidean space is


$\nabla = \mathbf{i} \frac{\partial}{\partial x} + \mathbf{j} \frac{\partial}{\partial y} + \mathbf{k} \frac{\partial}{\partial z}$ (12.59)


This tells us that the integral of the normal component of the function over the 
surface is equal to the integral of the divergence of the function throughout the 
volume. A very readable derivation and discussion of the divergence theorem may 
be found in Schey [378]. 
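
A quick numerical experiment can make the theorem feel less abstract. The sketch below compares the two sides of the divergence theorem on the unit cube for one arbitrarily chosen vector field; the field, the grid resolution, and all names are assumptions made for this illustration, and the divergence is worked out by hand.

```python
import numpy as np

def field(x, y, z):
    """An arbitrary smooth vector field chosen for this check."""
    return np.array([x * x, y * z, 3.0 * z])

def divergence(x, y, z):
    """Divergence of `field`, worked out by hand: 2x + z + 3."""
    return 2.0 * x + z + 3.0

res = 20
h = 1.0 / res
mids = (np.arange(res) + 0.5) * h   # midpoints of the cells along one axis

# Volume side: sum the divergence over the unit cube.
volume_integral = sum(divergence(x, y, z) * h**3
                      for x in mids for y in mids for z in mids)

# Surface side: sum F . n over the six faces, with n pointing outward.
surface_integral = 0.0
for axis in range(3):
    for side, sign in ((0.0, -1.0), (1.0, 1.0)):
        for u in mids:
            for v in mids:
                point = [u, v]
                point.insert(axis, side)   # place the fixed face coordinate
                surface_integral += sign * field(*point)[axis] * h**2

print(volume_integral, surface_integral)
```

Both sums come out to the same value (4.5 for this field), which is the statement that the net outflow through the surface equals the total divergence inside.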

We can use this theorem to rewrite $\Phi_s$ in a form that is equivalent to the other
terms. First we recall the streaming flux from Equation 12.49. Consider only the
inner integral over $S$, and expand the flux:

$\int_S \Phi(\mathbf{s}, \vec{\omega}) \, dS = \int_S n(\mathbf{r}, \vec{\omega}) \, \mathbf{c}(\mathbf{r}) \cdot \mathbf{n}(dS) \, dS$ (12.60)

where $n(\mathbf{r}, \vec{\omega})$ is the particle density. From this, we can identify the vector function
$\mathbf{F}$ in the divergence theorem as $\mathbf{F}(\mathbf{r}, \vec{\omega}) = n(\mathbf{r}, \vec{\omega}) \, \mathbf{c}(\mathbf{r})$. Then we can rewrite this
integral as

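$\int_S n(\mathbf{r}, \vec{\omega}) \, \mathbf{c}(\mathbf{r}) \cdot \mathbf{n}(dS) \, dS = \int_V \nabla \cdot [n(\mathbf{r}, \vec{\omega}) \, \mathbf{c}(\mathbf{r})] \, dV$
$= \int_V \mathbf{c}(\mathbf{r}) \cdot \nabla n(\mathbf{r}, \vec{\omega}) \, dV$
$= \int_V \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) \, dV$ (12.61)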


Notice that we have turned a divergence calculation into a gradient, which is then 
turned into an inner product with respect to the direction of flow. 

We can now wrap this result back into the outer integral over T to find 



$\Phi_s = \int_\Gamma \int_V \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) \, dV \, d\vec{\omega}$ (12.62)


With this change, we can now rewrite Equation 12.57 so that the outermost two 
integrals on each term are the same: 



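$\int_\Gamma \int_V \epsilon(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$
$+ \int_\Gamma \int_V \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$
$= \int_\Gamma \int_V \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$
$+ \int_\Gamma \int_V \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' \, d\mathbf{r} \, d\vec{\omega}$
$+ \int_\Gamma \int_V \sigma_a(\mathbf{r}, \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}) \, d\mathbf{r} \, d\vec{\omega}$ (12.63)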


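The advantage of this form comes when we use the full power of our earlier
definition of equilibrium. Recall that when the environment is in equilibrium, there
is zero net flux for every volume $V$ and set of directions $\Gamma$. That means our choices
of $V$ and $\Gamma$ in these derivations don’t matter; any choices will do. In other words,
we can pick any volume $V$ we want, and any set of directions $\Gamma$ we want, and we
will find that the net flow of particles through the combined phase-space volume
defined by these two domains is constant over time. Therefore it must be the case
that the equality in Equation 12.63 holds when we replace the integrals just by their
integrands. Stripping off the outer two integrals from each term then gives us the
standard one-speed particle transport equation:

$\epsilon(\mathbf{r}, \vec{\omega}) + \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}'$
$= \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) + \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' + \sigma_a(\mathbf{r}, \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega})$ (12.64)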



Equation 12.64 is known in the physics literature as a Boltzmann equation [431]. It 
tells us that there is a balance of gains and losses at every point in the environment. It 
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MOURI 12.35 

An isotropic medium; only the angle between the incident and scattered directions matters, not 
their absolute location. 


also gives us a condition on the flux $\Phi(\mathbf{r}, \vec{\omega})$; the flux must be such that Equation 12.64
holds. We say that such a flux is a solution of Equation 12.64.


12.9.7 Isotropic Materials 

We can simplify Equation 12.64 a bit when the material is isotropic. In particular,
consider the outscattering term $\Phi_o$.

In an isotropic material, the absolute orientations of the incident and scattered directions $\vec{\omega}$
and $\vec{\omega}'$ don’t matter; only the angle between them makes a difference. One way to
think of this is to imagine two rigid straight sticks, cemented together at one end
in fixed position. Label one stick $\vec{\omega}$ and the other $\vec{\omega}'$; no matter how the sticks are
moved or rotated in the medium, if it is isotropic, then $\kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}')$ is a constant.
See Figure 12.35. We can indicate this explicitly by writing the function so that it
depends only on the dot product between the two vectors: $\kappa(\mathbf{r}, \vec{\omega} \cdot \vec{\omega}')$.

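For an isotropic material we can write $\Phi_o$ as

$\Phi_o = \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \cdot \vec{\omega}') \, d\vec{\omega}'$
$= \Phi(\mathbf{r}, \vec{\omega}) \, \sigma_{io}(\mathbf{r})$ (12.65)

where the isotropic outscattering coefficient $\sigma_{io}(\mathbf{r})$ is

$\sigma_{io}(\mathbf{r}) = \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}_0 \cdot \vec{\omega}') \, d\vec{\omega}'$ (12.66)

for any choice of $\vec{\omega}_0 \in S^2$.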
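
The claim that the coefficient does not depend on the chosen direction can be tested numerically. The sketch below integrates a made-up kernel over the sphere for two different fixed directions; the kernel $k_0(1 + \mu^2)$, the value of `k0`, the quadrature grid, and the function names are all assumptions for illustration, not a material model taken from the text.

```python
import math
import numpy as np

def kappa(cos_angle, k0=0.3):
    """Hypothetical isotropic kernel that depends only on the dot product."""
    return k0 * (1.0 + cos_angle * cos_angle)

def sphere_directions(res=64):
    """Midpoint grid over the sphere: unit directions and their solid angles."""
    theta = (np.arange(res) + 0.5) * math.pi / res       # polar angle
    psi = (np.arange(2 * res) + 0.5) * math.pi / res     # azimuth, 0..2*pi
    t, p = np.meshgrid(theta, psi, indexing="ij")
    dirs = np.stack([np.sin(t) * np.cos(p),
                     np.sin(t) * np.sin(p),
                     np.cos(t)], axis=-1)
    weights = np.sin(t) * (math.pi / res) ** 2           # sin(theta) dtheta dpsi
    return dirs.reshape(-1, 3), weights.reshape(-1)

def sigma_io(w0):
    """Quadrature estimate of the outscattering coefficient for a fixed w0."""
    w0 = np.asarray(w0, dtype=float)
    w0 = w0 / np.linalg.norm(w0)
    dirs, weights = sphere_directions()
    return float(np.sum(kappa(dirs @ w0) * weights))

exact = 0.3 * 16.0 * math.pi / 3.0   # closed form for this particular kernel
print(sigma_io([0.0, 0.0, 1.0]), sigma_io([1.0, 2.0, -0.5]), exact)
```

Both estimates land close to the closed-form value, whichever fixed direction is used, because the kernel sees only the angle between the two directions.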

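We can now roll together the isotropic outscattering coefficient $\sigma_{io}(\mathbf{r})$ and the absorption
coefficient $\sigma_a(\mathbf{r})$ into the isotropic outscattering and absorption coefficient

$\sigma_{ioa}(\mathbf{r}) = \sigma_{io}(\mathbf{r}) + \sigma_a(\mathbf{r})$ (12.67)

and write Equation 12.64 in a simpler form:

$\epsilon(\mathbf{r}, \vec{\omega}) + \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \cdot \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}' = \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) + \sigma_{ioa}(\mathbf{r}) \, \Phi(\mathbf{r}, \vec{\omega})$ (12.68)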


Equation 12.68 is the standard isotropic one-speed particle transport equation , 
also called the standard stationary one-speed particle transport equation. 

12.10 Boundary Conditions 

The major result of the previous section was Equation 12.64. Because it contains 
the del operator V, it is at least partly a differential equation. We will concentrate 
on that nature of the equation for the moment. 

We know that all differential equations require boundary conditions in order 
to be fully specified. In this case, the transport equation describes the transfer of 
particles through space, so the boundary conditions describe what happens where 
space “stops”; that is, the surfaces of objects. So we can interpret our boundary 
conditions as simply a description of what happens to the flux at the surfaces of 
objects. This boils down to describing how surfaces will reflect (or transmit) the 
particles striking them. 

Following Arvo [15], we will introduce some notation that will help us discuss 
surfaces and what happens to the flux striking them. 

Recall that at each surface point $\mathbf{s}$, the local tangent plane splits the sphere of
directions into two hemispherical solid angles. We can think of $\Omega_o(\mathbf{s})$ as representing
all the directions in which particles can leave the front of the surface from $\mathbf{s}$, and
$\Omega_i(\mathbf{s})$ as containing all directions along which particles can arrive at the outside of
$\mathbf{s}$. Our job in finding boundary conditions is to describe the nature of the particles
returned to the medium through the directions $\Omega_o(\mathbf{s})$ as a function of the particles
arriving from the directions $\Omega_i(\mathbf{s})$.
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To facilitate that task, we will generalize the notation to take into account all the 
points on all the surfaces at once. To motivate this notation, imagine an environment 
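containing exactly three points: $\mathbf{s}_1$, $\mathbf{s}_2$, and $\mathbf{s}_3$. Then we could form the union of the
hemispheres of directions leaving the front, to build a subset of $\mathcal{R}^3 \otimes S^2$:

$(\mathbf{s}_1 \otimes \Omega_o(\mathbf{s}_1)) \cup (\mathbf{s}_2 \otimes \Omega_o(\mathbf{s}_2)) \cup (\mathbf{s}_3 \otimes \Omega_o(\mathbf{s}_3))$ (12.69)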

This idea generalizes nicely into the situation where we have continuous surfaces 
made of an infinite number of points. We simply form the collection from all pairs 
(s, uj) that give a point in space and a direction. 

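Combining all points with the front outgoing directions $\Omega_o$ gives us the set $\mathcal{P}^o$,
where the superscript $o$ stands for outgoing. Similarly, we get $\mathcal{P}^i$, $\mathcal{P}_o$, and $\mathcal{P}_i$ by
combining the set of surface points with the hemispherical sets $\Omega_i$, $\mho_o$, and $\mho_i$:

$\mathcal{P}^o = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Omega_o\}$
$\mathcal{P}^i = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Omega_i\}$
$\mathcal{P}_o = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \mho_o\}$
$\mathcal{P}_i = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \mho_i\}$ (12.70)

So $\mathcal{P}^i$ is the collection of all direction vectors that arrive at a point on $\mathcal{M}$ on
the same side as the surface normal at that point, and $\mathcal{P}_i$ is the set of all direction
vectors that arrive at a point on the side opposite the normal. This interpretation is
illustrated in Figure 12.36.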

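We can form the sixteen combinations of these sets just as we did for the hemispherical
sets, producing four mixed-direction sets as before. We define them following
the same pattern:

$\mathcal{P}^{io} = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta^{io}\}$
$\mathcal{P}^i_o = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta^i_o\}$
$\mathcal{P}^o_i = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta^o_i\}$
$\mathcal{P}_{io} = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta_{io}\}$ (12.71)

The two pure incoming and outgoing sets are defined similarly:

$\mathcal{P}^i_i = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta^i_i\}$
$\mathcal{P}^o_o = \{(\mathbf{s}, \vec{\omega}) \in \mathcal{R}^3 \otimes S^2 : \mathbf{s} \in \mathcal{M}, \vec{\omega} \in \Theta^o_o\}$ (12.72)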

Our job of specifying boundary conditions now boils down to describing the flux
leaving every point on every surface in terms of the incident flux. In other words, we
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(a) The set V X Q . (b) The set V f. 


want to describe the flux $\Phi$ for all elements of $\mathcal{P}^o_o$ in terms only of $\Phi$ for all elements
of $\mathcal{P}^i_i$: $\Phi(\mathcal{P}^o_o) = f(\Phi(\mathcal{P}^i_i), E)$, where $E$ encompasses surface emission. The particular
choice of $f$ provides the boundary conditions at that point.

There are many ways to specify these boundary conditions, depending on the type 
of problem at hand and the information that has been given to us. These include the 
following: 

FIGURE 12.37 

Different boundary conditions, (a) Free, (b) Periodic, (c) Explicit, (d) Implicit. 


Free: When particles can exit the system at a surface but not reenter it, this is called a 
free boundary condition. An example of a free boundary in computer graphics 
comes from the common practice of surrounding every environment with a 
large enclosing object, often a perfectly absorbing black sphere. Any energy 
striking this sphere is simply removed from the system, as in Figure 12.37(a). 

Periodic: When a domain is periodic, then the boundary conditions share that 
property. Periodic conditions are usually involved when several distinct locations in 
a parameter space map to the same single location in physical space. Periodic 
conditions can be expressed by constraints like 

$\Phi(\mathbf{s}, \vec{\omega}) = \Phi(\mathbf{s}', \vec{\omega})$    (12.73) 

for two different points $\mathbf{s} \neq \mathbf{s}'$. 

For example, one model of the cosmological universe posits that it is closed: 
the universe wraps back upon itself, so that if you go far enough in any 
direction you’ll end up back where you started. A more accessible example 
is circular travel around the central axis of a right circular cylinder, as in 
Figure 12.37(b). Here the two ends of the parameter space ($\theta = 0$ and $\theta = 2\pi$) 
map to the same location in physical space. So $\Phi(\mathbf{s} + k\mathbf{v}, \vec{\omega}) = \Phi(\mathbf{s}, \vec{\omega})$ where 
$k \in \mathcal{Z}$ and $\mathbf{v}$ represents a full turn around the cylinder. 

Explicit: When the flux leaving a point on a surface is independent of the incident 
flux, we say that the outgoing flux is specified explicitly. This can be expressed 
with an explicit function: 

$\Phi(\mathbf{s}, \vec{\omega}) = e_s(\mathbf{s}, \vec{\omega})$    (12.74) 

where $e_s$ is a surface emission function. Notice that $e_s$ is independent of the 
incident flux. 

A common use of explicit boundary conditions is to describe a luminaire , 
which emits a fixed pattern of light regardless of what light is falling upon it, 
as in Figure 12.37(c). 

Implicit: When the flux leaving a point on a surface depends on the incident flux, 
then we specify it with implicit or reflecting boundary conditions, as in 
Figure 12.37(d). This is by far the most common type in computer graphics, 
accounting for most physical surfaces (this class includes transmission of 
energy through transparent surfaces). 

The most general description of an implicit boundary involves an arbitrary 
function $f_s$ applied to the incident flux to determine the outgoing flux: 

$\Phi(\mathbf{s}, \vec{\omega}) = f_s(\Phi(\mathbf{s}, \vec{\omega}')), \quad \vec{\omega}' \in \Theta^i_i$    (12.75) 

where $f_s()$ is a surface-scattering function. The function $f_s$ has access to all 
incident directions $\vec{\omega}'$. 

Such a function is more general than we need to simulate real materials. Recall 
that Section 12.8 showed that volume scattering is linear. Over a very wide 
range, surface scattering is also linear. Thus the contribution of the incident 
flux $\Phi(\mathbf{s}, \vec{\omega}_1)$ at point $\mathbf{s}$, along direction $\vec{\omega}_1 \in \Theta^i_i(\mathbf{s})$, is independent of the 
incident flux along $\vec{\omega}_2 \in \Theta^i_i(\mathbf{s})$ for $\vec{\omega}_2 \neq \vec{\omega}_1$. This suggests that 
the surface scattering function $f_s$ above can be rewritten to simply scale the 
incident flux in each direction, and then sum the scaled fluxes. The amount by 
which some incident direction $\vec{\omega}'$ is scaled depends only on $\mathbf{s}$ and the outgoing 
direction $\vec{\omega}$. We can write this as 

$\Phi(\mathbf{s}, \vec{\omega}) = \int_{\Theta^i_i(\mathbf{s})} \kappa_s(\mathbf{s}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{s}, \vec{\omega}') \, d\vec{\omega}'$    (12.76) 

where $\kappa_s$ is the surface-scattering distribution function. This gives us the 
amount contributed to the outgoing flux by the flux in each incident direction 
$\vec{\omega}' \in \Theta^i_i(\mathbf{s})$. In general, this function can contain distribution functions like 
the delta function. 

In a physical situation, energy must be conserved: that is, no more energy 
may be sent into $\Theta^o_o(\mathbf{s})$ than arrives along $\Theta^i_i(\mathbf{s})$, and only nonnegative amounts of 
energy may be radiated: 

$\int_{\Theta^i_i(\mathbf{s})} \kappa_s(\mathbf{s}, \vec{\omega}' \to \vec{\omega}) \, d\vec{\omega}' \le 1 \qquad \forall (\mathbf{s}, \vec{\omega}) \in \mathcal{H}^o_o$ 
$\kappa_s(\mathbf{s}, \vec{\omega}' \to \vec{\omega}) \ge 0 \qquad \forall (\mathbf{s}, \vec{\omega}) \in \mathcal{H}^o_o$    (12.77) 


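Mixed: The most general boundary conditions are a combination of implicit and 
explicit; in graphics, this means combining how a surface radiates light with 
how it reflects it. We can write the boundary conditions at a point $\mathbf{s} \in M$ as 

$\Phi(\mathbf{s}, \vec{\omega}) = e_s(\mathbf{s}, \vec{\omega}) + \int_{\Theta^i_i(\mathbf{s})} \kappa_s(\mathbf{s}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{s}, \vec{\omega}') \, d\vec{\omega}'$    (12.78) 

for all $(\mathbf{s}, \vec{\omega}) \in \mathcal{H}^o_o$. 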
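
The mixed boundary condition can be made concrete with a small numerical sketch. The code below is only an illustration under stated assumptions: it replaces the integral over the incident hemisphere with a sum over equal-solid-angle bins, and the function name `outgoing_flux`, the constant scattering function, and the numerical values are all hypothetical choices, not anything defined in the text.

```python
import math

def outgoing_flux(emission, kappa_s, incident_flux, n_bins):
    """Discretized mixed boundary condition at one surface point and one
    outgoing direction: emitted flux plus scattered incident flux.

    emission      -- explicit (emitted) part of the outgoing flux
    kappa_s       -- function(j): scattering distribution for incident bin j
    incident_flux -- function(j): incident flux arriving in bin j
    n_bins        -- number of equal-solid-angle bins on the incident hemisphere
    """
    d_omega = 2.0 * math.pi / n_bins   # a hemisphere covers 2*pi steradians
    scattered = sum(kappa_s(j) * incident_flux(j) * d_omega
                    for j in range(n_bins))
    return emission + scattered

# Hypothetical surface: scatters a fraction rho of the incident flux uniformly,
# so the scattering function integrates to rho <= 1 (energy is conserved).
rho = 0.6
n = 64
flux = outgoing_flux(0.25, lambda j: rho / (2.0 * math.pi), lambda j: 1.0, n)
print(flux)
```

With unit incident flux from every direction, the scattered part is simply `rho`, so the result is the emission plus `rho`. Setting the emission to zero gives a purely implicit boundary, and setting `rho` to zero gives a purely explicit one.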

When boundary conditions are added to the transport equations 12.64 and 12.68, 
the result is a complete and well-defined specification for the flux $\Phi(\mathbf{s}, \vec{\omega})$ for every 
point and direction in the environment. The whole point of most rendering 
algorithms is to find the function $\Phi$ that satisfies that specification. 


12.11 The Integral Form 

The way we have stated the general transport equation in Equation 12.64 is perfectly 
valid but inconvenient. Because it contains both an integral and derivative of the 
unknown quantity $\Phi$, it is called an ordinary integro-differential equation. Although 
solution techniques for integro-differential equations exist [120,173], the theory for 
solving this type of equation is not nearly as well developed as the theories for solving 
partial differential equations [60] and integral equations [251]. Although integral 
equations are not typically introduced in school to the same extent as differential 
equations, they have been well studied for a long time [324,456], and solution 
techniques for integral equations now appear in standard reference works on numerical 
methods [348]. 

To make use of the powerful solution methods for differential or integral 
equations, we need to convert our transport equation into one of these forms. The usual 
approach in transport theory is to recast the equation as an integral equation [129], 
and that is the approach we take here. One of the advantages of the integral equation 
form is that the boundary conditions are rolled right into the equation, rather than 
being maintained as a separate set of constraints. 

12.11.1 An Example 

The process of converting Equation 12.64 into an integral equation requires several 
steps. These steps involve some new notation and ideas, and you could easily get 
lost in the clutter. To help keep the forest in sight when walking through the trees, 
we begin with a simple example that presents the basic steps. 

Consider the following integro-differential equation involving an unknown 
function $f(t)$: 

$\frac{df(t)}{dt} - \lambda \int K(t, \mu) f(\mu) \, d\mu = g(t)$    (12.79) 

where the constraint $g(t)$ and the function $K(t, \mu)$ are given. When an integral of 
the type $\int K(t, \mu) f(\mu) \, d\mu$ is used to find $f$, the function $K$ is called the kernel of the 
integral. 

Equation 12.79 contains many of the key features of Equation 12.64, though in 
a simpler and more abstract form. When g(t) = 0, this type of equation is said to 
be homogeneous ; otherwise it is nonhomogeneous. The function g(t) is variously 
called the driving function , the forcing function , or the input function in the signal 
processing and differential equations literatures [60,151]. 

Suppose we are also given the initial condition $f(a) = f_a$. Integrating both sides 
with respect to $t$ yields 

$f(t) - \lambda \int \left[ \int_a^t K(\nu, \mu) \, d\nu \right] f(\mu) \, d\mu = \int_a^t g(\nu) \, d\nu + f_a$ 
$f(t) - \lambda \int K_1(t, \mu) f(\mu) \, d\mu = G(t) + f_a$    (12.80) 

where we have simply collected and renamed some terms in the second line, writing 
$K_1(t, \mu) = \int_a^t K(\nu, \mu) \, d\nu$ and $G(t) = \int_a^t g(\nu) \, d\nu$. We now 
have a completely equivalent representation for the original equation, including the 
boundary condition. 

Because Equation 12.80 expresses the function $f(t)$ only in terms of itself and 
integrals containing it, an equation of this form is called an integral equation. The 
advantage of this transformation is that we can now use the sophisticated tools of 
integral equation theory to find $f(t)$. 
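
One of the simplest of those tools is successive approximation: guess the unknown function, substitute the guess into the integral, and repeat. The sketch below is a hedged illustration only. It assumes a fixed integration interval of [0, 1], a midpoint quadrature rule, and a hypothetical kernel and driving function chosen so that the exact answer is known; none of these choices comes from the text, and the function name `solve_fredholm` is ours.

```python
def solve_fredholm(g, kernel, lam, n=200, iterations=60):
    """Successive approximation for
        f(t) = g(t) + lam * integral_0^1 kernel(t, m) f(m) dm
    on n midpoint quadrature nodes. Returns (nodes, values of f at the nodes).
    """
    h = 1.0 / n
    nodes = [(i + 0.5) * h for i in range(n)]    # midpoint quadrature nodes
    f = [g(t) for t in nodes]                    # initial guess: f = g
    for _ in range(iterations):
        # substitute the current guess into the right-hand side
        f = [g(t) + lam * h * sum(kernel(t, m) * fm for m, fm in zip(nodes, f))
             for t in nodes]
    return nodes, f

# Hypothetical test problem: g(t) = t, kernel(t, m) = t*m, lam = 1.
# Substituting f(t) = c*t gives c = 1 + c/3, so the exact solution is 1.5*t.
nodes, f = solve_fredholm(lambda t: t, lambda t, m: t * m, lam=1.0)
print(f[-1], 1.5 * nodes[-1])
```

The iteration converges here because each pass shrinks the error by roughly a factor of three. In the notation of Equation 12.80, `g` plays the role of the right-hand side and `kernel` the role of the kernel inside the integral.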

We will now take this idea and apply it to Equation 12.64 in order to transform 
it from an integro-differential equation to an integral equation. Most rendering 
theory is devoted to describing how the theory of integral equations may be used to 
understand that equation, and the design of algorithms to efficiently solve it for the 
function $\Phi$, representing the flow of light in a scene. 

12.11.2 The Integral Form of the Transport Equation 

This section closely follows the derivations presented by Arvo [15] and Pomraning 
[342]. 

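We start by recapitulating the basic transport equation from Equation 12.64: 

$e(\mathbf{r}, \vec{\omega}) + \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}'$ 
$\quad = \vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) + \Phi(\mathbf{r}, \vec{\omega}) \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' + \sigma_a(\mathbf{r}, \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega})$    (12.81) 

which holds for all $(\mathbf{r}, \vec{\omega}) \in (\mathcal{R}^3 - M) \otimes S^2$. 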

We remove the differential nature of this equation by converting it to a simple 
differential equation, and then integrating as in the example in the previous section. 

The first step in simplifying Equation 12.81 is to get rid of the del operator and 
turn the streaming term into a simple derivative. We recall from the definition of 
the gradient that $\vec{\omega} \cdot \nabla f$ for any scalar field $f$ is the directional derivative of $f$ in 
the direction of $\vec{\omega}$, since we’re projecting the gradient into $\vec{\omega}$ [377]. In other words, 
$\vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega})$ is just the change in $\Phi$ along the direction $\vec{\omega}$. The trick is to notice that 
we can specify various points along this direction with the expression $\mathbf{r} + \alpha\vec{\omega}$, so 
that $\alpha = 0$ specifies $\mathbf{r}$, and $\alpha \neq 0$ specifies points along the line parallel to $\vec{\omega}$ passing 
through $\mathbf{r}$. The directional derivative of $\Phi$ onto $\vec{\omega}$ is the derivative of $\Phi$ along this 
line with respect to $\alpha$: 

$\vec{\omega} \cdot \nabla \Phi(\mathbf{r}, \vec{\omega}) = \left. \frac{\partial}{\partial \alpha} \Phi(\mathbf{r} + \alpha\vec{\omega}, \vec{\omega}) \right|_{\alpha = 0}$    (12.82) 
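
This identity is easy to check numerically. The following sketch uses a hypothetical smooth scalar field in place of the flux (holding the direction argument fixed) and compares the dot product of the direction with the analytic gradient against a central-difference derivative along the line. The field, the sample point, and the function names are our own illustrative assumptions.

```python
import math

def field(p):
    """A hypothetical smooth scalar field standing in for the flux."""
    x, y, z = p
    return x * x + 3.0 * y + math.sin(z)

def gradient(p):
    """Analytic gradient of field()."""
    x, y, z = p
    return (2.0 * x, 3.0, math.cos(z))

def derivative_along(f, r, omega, h=1e-5):
    """Central-difference estimate of d/d(alpha) f(r + alpha*omega) at alpha = 0."""
    ahead = tuple(ri + h * wi for ri, wi in zip(r, omega))
    behind = tuple(ri - h * wi for ri, wi in zip(r, omega))
    return (f(ahead) - f(behind)) / (2.0 * h)

r = (1.0, 2.0, 0.5)
omega = (2.0 / 3.0, 1.0 / 3.0, 2.0 / 3.0)      # a unit-length direction

dot = sum(w * g for w, g in zip(omega, gradient(r)))
forward = derivative_along(field, r, omega)
# Walking the other way along the same line flips the sign of the derivative.
backward = derivative_along(field, r, tuple(-w for w in omega))
print(dot, forward, backward)
```

The forward estimate agrees with the dot product, and the backward estimate is its negative, which is the sign change we exploit when looking backward along the direction of travel.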


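Figure 12.38 shows an abstract representation of this operation. Because on the 
page we can only draw $\Phi$ as a function of two variables, this figure shows $\Phi(r, \vec{\omega})$, 
where $r$ is a distance. 

Looking more closely at $\Phi$ around $\mathbf{r}$, we can draw two vectors along the line $\vec{\omega}$, 
given by $\mathbf{A} = \mathbf{r} + \vec{\omega}$ and $\mathbf{B} = \mathbf{r} - \vec{\omega}$, as in Figure 12.39. The vectors $\mathbf{A}$ and $\mathbf{B}$ are 
antiparallel, so 

$\mathbf{A} = -\mathbf{B}$ 
$\frac{\partial}{\partial \alpha} \Phi(\mathbf{r} + \alpha\vec{\omega}, \vec{\omega}) = -\frac{\partial}{\partial \alpha} \Phi(\mathbf{r} - \alpha\vec{\omega}, \vec{\omega})$    (12.83) 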


This is useful because we would like to look backward along $\vec{\omega}$ and see where the 
flux is coming from. 

The advantage of this approach is that as we look backward from $\mathbf{r}$ along $-\vec{\omega}$, we 
are simply looking at flux traveling through space. We have just spent a good deal 
of time describing the scattering, emitting, and absorbing events that can happen as 
a particle moves through space, so we have a good idea of what’s happening to the 
flux at each point along its path along $\vec{\omega}$ from an arbitrary point until it eventually 
reaches $\mathbf{r}$. 

FIGURE 12.38 

The function $\Phi(r + \alpha\vec{\omega}, \vec{\omega})$ as a function of radius $r$ and direction $\vec{\omega}$. 

FIGURE 12.39 

The vectors $\mathbf{A}$ and $\mathbf{B}$ are antiparallel. 

The arbitrary point referred to in the previous sentence will lie either within 
another bit of volume, or on a surface. If it’s on a surface, then we get the flux 
leaving that point on the surface from the boundary conditions. 

This is the essential step in the transformation of Equation 12.64 from an 
integro-differential equation to an integral equation: by looking far enough backward along 
the incident direction, we eventually find a surface, and we can find the flux at that 
point from the boundary conditions. We now turn to phrasing that statement in 
mathematical terms. 

We will need to introduce some new notation, or expressions like Equation 12.64 
will start to look simple by comparison. 

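First we will gather together the positive contributions (or gains) to the flux as it 
flows through space. We create the gain function $G(\mathbf{r}, \vec{\omega})$, which simply combines 
volumetric emission and inscattering: 

$G(\mathbf{r}, \vec{\omega}) = e(\mathbf{r}, \vec{\omega}) + \int_{S^2} \kappa(\mathbf{r}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{r}, \vec{\omega}') \, d\vec{\omega}'$    (12.84) 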

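We will now introduce a whole family of functions derived from those we’ve seen 
by placing a hat over them. For any function $f(\mathbf{r}, \vec{\omega})$, the notation $\hat{f}(\alpha)$ refers to 
$f(\mathbf{r} - \alpha\vec{\omega}, \vec{\omega})$, so this tells us the value of the function as we walk backward along 
the direction $-\vec{\omega}$ from the point $\mathbf{r}$ [15]. 

Using the notation described above, we can write the basic transport equation 
more succinctly as 

$-\frac{\partial}{\partial \alpha} \hat{\Phi}(\alpha) + \hat{\Phi}(\alpha) \int_{S^2} \kappa(\mathbf{r} - \alpha\vec{\omega}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' = \hat{G}(\alpha)$    (12.85) 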

Equation 12.85 looks something like an ordinary differential equation for $\hat{\Phi}(\alpha)$, 
but it has two features that seem to prevent it from actually being one. The first is 
that $\hat{G}$ seems to depend on $\Phi$, even though the notation hides that dependency. It is 
true that $G$ does depend on $\Phi$ in general, but $\hat{\Phi}(\alpha)$ is what we’re solving for here, 
and that’s an infinitely thin line. In terms of measure theory, $\hat{\Phi}$ contributes a set of 
measure zero to the integral, and therefore may be safely ignored [425]. 

The other problem is the unpleasant integral on the left-hand side. But this 
integral is just a constant with respect to our unknown $\hat{\Phi}(\alpha)$, so we will write it 
simply as $\hat{\sigma}_c(\alpha)$: 

$\hat{\Phi}(\alpha) \int_{S^2} \kappa(\mathbf{r} - \alpha\vec{\omega}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}' = \hat{\Phi}(\alpha) \, \hat{\sigma}_c(\alpha)$    (12.86) 

Multiplying Equation 12.85 by $-1$ on both sides and substituting the above, we 
have 

$\frac{\partial}{\partial \alpha} \hat{\Phi}(\alpha) - \hat{\Phi}(\alpha) \, \hat{\sigma}_c(\alpha) = -\hat{G}(\alpha)$    (12.87) 

This is our first result in this section: it is a linear, first-order differential equation in 
$\hat{\Phi}$ with variable coefficients, and even has a first-order coefficient of 1. 

This type of equation is easily solved. We digress for a moment to solve it in 
the general case, and then return to the specifics of Equation 12.87. Consider the 
following differential equation for $y(x)$, where a derivative is indicated by a prime, 
as in $y'(x)$: 

$y'(x) + p(x) y(x) = g(x)$    (12.88) 

We would like to find $y(x)$. The trick is to look for a function $\mu(x)$ such that when 
we multiply the left side by $\mu(x)$, the result is the derivative of the product 
$[\mu(x) y(x)]'$. Such a function is called an integrating factor [60]. 
Suppose for a moment that we had such a function. Then we can simply multiply 
everything out and see what $\mu(x)$ would have to be to satisfy these requirements: 

$\mu(x) [y'(x) + p(x) y(x)] = [\mu(x) y(x)]'$ 
$\mu(x) y'(x) + \mu(x) p(x) y(x) = \mu(x) y'(x) + \mu'(x) y(x)$    (12.89) 

If we now assume that $\mu(x) \neq 0$, 

$\frac{\mu'(x)}{\mu(x)} = p(x)$ 
$\ln[\mu(x)] = \int_0^x p(\tau) \, d\tau$ 
$\mu(x) = \exp\left[ \int_0^x p(\tau) \, d\tau \right]$    (12.90) 

Note that $\mu(x) > 0$ for all $x$, so our assumption is fulfilled. 
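
To see the integrating factor at work, the sketch below solves a first-order linear equation numerically by accumulating the two integrals with the trapezoid rule. It is only an illustration: the function name `solve_first_order`, the starting point of zero, and the constant-coefficient test problem are our assumptions, chosen because the exact solution is easy to write down.

```python
import math

def solve_first_order(p, g, y0, x_end, n=4000):
    """Solve y' + p(x) y = g(x) with y(0) = y0 using an integrating factor.

    mu(x) = exp(integral_0^x p), and (mu * y)' = mu * g, so
    y(x) = (y0 + integral_0^x mu * g) / mu(x).
    Both integrals are accumulated with the trapezoid rule.
    """
    h = x_end / n
    big_p = 0.0            # running integral of p
    acc = 0.0              # running integral of mu * g
    prev_p = p(0.0)
    prev_mg = g(0.0)       # mu(0) = 1
    for i in range(1, n + 1):
        x = i * h
        cur_p = p(x)
        big_p += 0.5 * h * (prev_p + cur_p)
        cur_mg = math.exp(big_p) * g(x)
        acc += 0.5 * h * (prev_mg + cur_mg)
        prev_p, prev_mg = cur_p, cur_mg
    return (y0 + acc) / math.exp(big_p)

# Hypothetical test problem: y' + 2y = 4 with y(0) = 0,
# whose exact solution is y(x) = 2 * (1 - exp(-2x)).
y1 = solve_first_order(lambda x: 2.0, lambda x: 4.0, 0.0, 1.0)
print(y1, 2.0 * (1.0 - math.exp(-2.0)))
```

The same routine works for variable coefficients, since nothing in it assumes that `p` or `g` is constant.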

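We now return to our problem of solving Equation 12.87. This matches the form 
of Equation 12.88, with $p(\alpha) = -\hat{\sigma}_c(\alpha)$. Therefore the integrating factor is 

$\mu(\alpha) = \exp\left[ -\int_0^\alpha \hat{\sigma}_c(\tau) \, d\tau \right]$    (12.91) 

Using Equation 12.89 to change the left-hand side, Equation 12.87 becomes 

$\frac{d}{d\alpha} [\mu(\alpha) \hat{\Phi}(\alpha)] = -\mu(\alpha) \hat{G}(\alpha)$ 
$\int_0^\alpha \frac{d}{d\tau} [\mu(\tau) \hat{\Phi}(\tau)] \, d\tau = \int_0^\alpha -\mu(\tau) \hat{G}(\tau) \, d\tau$ 
$\left. \mu(\tau) \hat{\Phi}(\tau) \right|_{\tau = 0}^{\tau = \alpha} = -\int_0^\alpha \mu(\tau) \hat{G}(\tau) \, d\tau$    (12.92) 

Solving for $\hat{\Phi}(0)$, and noting that $\mu(0) = 1$, we find 

$\hat{\Phi}(0) = \mu(\alpha) \hat{\Phi}(\alpha) + \int_0^\alpha \mu(\tau) \hat{G}(\tau) \, d\tau$    (12.93) 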


Equation 12.93 tells us how to find the flux $\hat{\Phi}(0) = \Phi(\mathbf{r}, \vec{\omega})$ in terms of the flux 
arriving along the direction $\vec{\omega}$. As we mentioned before, the purpose of this approach 
is that we can now look backward along $-\vec{\omega}$ to find a point $\mathbf{s} \in M$; that is, a point 
on a surface. We use the boundary conditions to find the flux at $\mathbf{s}$ directed along $\vec{\omega}$, 
and then adjust the flux as it travels toward $\mathbf{r}$. 

To find this point $\mathbf{s}$, we will create and use the visible-surface function (or the 
nearest-surface function or ray-tracing function) $\nu(\mathbf{r}, \vec{\omega}): \mathcal{R}^3 \otimes S^2 \to \mathcal{R}$. This takes 
any point $\mathbf{r}$ and searches along $-\vec{\omega}$ until it finds the nearest point on any surface $M$. 
This function is the primary object of interest in a ray-tracing program because it is 
often a very slow operation. But theoretically we have no problem simply defining 

$\nu(\mathbf{r}, \vec{\omega}) = \inf \{\alpha > 0 : \mathbf{r} - \alpha\vec{\omega} \in M\}$    (12.94) 

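So $\nu(\mathbf{r}, \vec{\omega})$ is a scalar function that returns the distance from a point $\mathbf{r}$ to the nearest 
point $\mathbf{s} \in M$. This point is actually given by simply subtracting the vector 
$\nu(\mathbf{r}, \vec{\omega})\vec{\omega}$ from $\mathbf{r}$: 

$\mathbf{s} = \mathbf{r} - \nu(\mathbf{r}, \vec{\omega}) \vec{\omega}$    (12.95) 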

This function is illustrated in Figure 12.40. 
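
For a scene consisting of a single sphere, the visible-surface function has a closed form, which makes a convenient sketch. The code below is an assumption-laden illustration rather than a general ray tracer: the scene is one hypothetical sphere, the direction is taken to be unit length, and the names `nearest_surface_distance` and `nearest_surface_point` are ours.

```python
import math

def nearest_surface_distance(r, omega, center, radius):
    """Smallest alpha > 0 such that r - alpha*omega lies on the sphere,
    or None if the backward ray never reaches it. omega must be unit length."""
    d = tuple(ri - ci for ri, ci in zip(r, center))
    b = sum(di * wi for di, wi in zip(d, omega))        # d . omega
    c = sum(di * di for di in d) - radius * radius
    disc = b * b - c                                    # from alpha^2 - 2*b*alpha + c = 0
    if disc < 0.0:
        return None                                     # the line misses the sphere
    root = math.sqrt(disc)
    for alpha in (b - root, b + root):                  # roots in increasing order
        if alpha > 1e-12:
            return alpha
    return None                                         # sphere lies ahead, not behind

def nearest_surface_point(r, omega, alpha):
    """The surface point s = r - alpha*omega."""
    return tuple(ri - alpha * wi for ri, wi in zip(r, omega))

r = (0.0, 0.0, 0.0)
omega = (0.0, 0.0, 1.0)     # flux travels in +z, so we search backward along -z
alpha = nearest_surface_distance(r, omega, (0.0, 0.0, -5.0), 1.0)
s = nearest_surface_point(r, omega, alpha)
print(alpha, s)
```

A real scene would take the minimum of such distances over every surface, which is exactly the search that makes this function expensive in practice.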

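Now that we have a way to identify the surface point $\mathbf{s}$ from which flux travels to 
$\mathbf{r}$ along $\vec{\omega}$, we need to find a description of that flux. This is exactly what is provided 
by the boundary conditions from Equation 12.78: 

$\Phi(\mathbf{s}, \vec{\omega}) = e_s(\mathbf{s}, \vec{\omega}) + \int_{\Theta^i_i(\mathbf{s})} \kappa_s(\mathbf{s}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{s}, \vec{\omega}') \, d\vec{\omega}'$    (12.96) 

for all $(\mathbf{s}, \vec{\omega}) \in \mathcal{H}^o_o$. 

Combining this with Equation 12.93 and expanding out the “hat” functions, we 
find 

$\Phi(\mathbf{r}, \vec{\omega}) = \mu(\mathbf{r}, \mathbf{s}) \, \Phi(\mathbf{s}, \vec{\omega}) + \int_0^{\nu(\mathbf{r}, \vec{\omega})} \mu(\mathbf{r}, \mathbf{a}) \, G(\mathbf{a}, \vec{\omega}) \, d\alpha$    (12.97) 

where $\mathbf{a} = \mathbf{r} - \alpha\vec{\omega}$, and $|\vec{\omega}| = 1$. 

FIGURE 12.40 

The visible-surface function. 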


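Expanding out $G$ to see the explicit dependence on $\Phi$, we arrive at our goal in 
this section: the integral form of the transport equation: 

$\Phi(\mathbf{r}, \vec{\omega}) = \mu(\mathbf{r}, \mathbf{s}) \, \Phi(\mathbf{s}, \vec{\omega})$ 
$\quad + \int_0^h \mu(\mathbf{r}, \mathbf{a}) \left[ e(\mathbf{a}, \vec{\omega}) + \int_{S^2} \kappa(\mathbf{a}, \vec{\omega}' \to \vec{\omega}) \, \Phi(\mathbf{a}, \vec{\omega}') \, d\vec{\omega}' \right] d\alpha$    (12.98) 

where 

$\mathbf{a} = \mathbf{r} - \alpha\vec{\omega}$ 
$h = \nu(\mathbf{r}, \vec{\omega})$ 
$\mathbf{s} = \mathbf{r} - h\vec{\omega}$ 
$\mu(\mathbf{r}, \mathbf{s}) = \exp\left[ -\int_0^{\|\mathbf{r} - \mathbf{s}\|} \sigma_c(\mathbf{r} - \tau\vec{\omega}, \vec{\omega}) \, d\tau \right]$ 
$\sigma_c(\mathbf{r}, \vec{\omega}) = \int_{S^2} \kappa(\mathbf{r}, \vec{\omega} \to \vec{\omega}') \, d\vec{\omega}'$    (12.99) 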

This equation is a form of a more general relationship called the Boltzmann equation, 
and indeed Equation 12.98 may be derived from the Boltzmann equation along with 
the boundary conditions [116]. 

In words, Equation 12.98 says that the flux at a point $\mathbf{r}$, arriving along a direction 
$\vec{\omega}$, is the sum of two components. The first component is found by searching 
backward along $-\vec{\omega}$ until we find the nearest point $\mathbf{s}$ on any surface. We find the 
flux $\Phi(\mathbf{s}, \vec{\omega})$ leaving that point along $\vec{\omega}$, and adjust its magnitude by the outscattering, 
emission, and absorption that happen to it along the way from $\mathbf{s}$ to $\mathbf{r}$. The other 
component is found by looking at every point $\mathbf{r} - \alpha\vec{\omega}$ on the line between $\mathbf{s}$ (where 
$\alpha = h$) and $\mathbf{r}$ (where $\alpha = 0$). At each point, we combine the volumetric emission at 
that point with any inscattered flux, and then we adjust the magnitude of that flux 
by the volumetric effects of outscattering, emission, and absorption as it travels from 
that point to $\mathbf{r}$. An excellent discussion of the assumptions inherent in any form of 
transport equation, and this equation in particular, may be found in Pomraning’s 
text [342, pp. 47-49]. 
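
The two components described above can be evaluated numerically along a single backward ray. The sketch below is a simplified illustration, not a rendering algorithm: it assumes that the gain along the ray is already known (in the real equation the gain depends on the unknown flux through inscattering), it uses midpoint quadrature, and the name `flux_at_r` and all numerical values are hypothetical.

```python
import math

def flux_at_r(surface_flux, h, sigma_c, gain, steps=2000):
    """Flux arriving at r along one direction: attenuated surface flux plus
    attenuated gains collected along the backward ray.

    surface_flux -- flux leaving the nearest surface point s toward r
    h            -- distance from r back to s
    sigma_c      -- function(alpha): attenuation coefficient at distance alpha from r
    gain         -- function(alpha): gain (emission plus inscatter) at distance alpha
    """
    d_alpha = h / steps
    optical_depth = 0.0       # integral of sigma_c from 0 up to the current step
    volume_term = 0.0
    for i in range(steps):
        alpha = (i + 0.5) * d_alpha
        # attenuation between this sample and r, evaluated at the step midpoint
        depth_mid = optical_depth + 0.5 * sigma_c(alpha) * d_alpha
        volume_term += math.exp(-depth_mid) * gain(alpha) * d_alpha
        optical_depth += sigma_c(alpha) * d_alpha
    return math.exp(-optical_depth) * surface_flux + volume_term

# Hypothetical homogeneous medium: constant attenuation 0.5 and constant gain 0.2,
# with the nearest surface 3 units behind r and sending flux 2 toward it.
phi = flux_at_r(2.0, 3.0, lambda a: 0.5, lambda a: 0.2)
exact = 2.0 * math.exp(-1.5) + 0.2 * (1.0 - math.exp(-1.5)) / 0.5
print(phi, exact)
```

For constant coefficients the integral has the closed form computed in `exact`, which gives a direct check on the marching loop.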

This equation really says nothing more nor less than Equation 12.64. The 
mechanical differences are two: $\Phi$ is expressed only in terms of itself and integrals 
containing it (there are no derivatives containing $\Phi$), and the boundary conditions 
are contained in the equation itself, rather than as auxiliary constraints. In either 
case, our goal in image synthesis is to find the function $\Phi$ that satisfies the 
equation and boundary conditions, for that describes the distribution of light in a scene, 
which is what we need to know to render a picture. In fact, rendering algorithms 
are nothing but various methods to find approximate descriptions of $\Phi$. 

As we mentioned earlier, the integral form of the transport equation is valuable 
to us because it connects rendering with the deep and well-studied field of integral 
equations. Stating the transport equation, and establishing its connection to integral 
equation theory, were the main purposes of this chapter. 


12.12 The Light Transport Equation 

We have so far computed only “flux,” and except for illustrative examples we have 
not tied down that flux to any particular interpretation. All of the theory in this 
chapter up to now is equally applicable to any phenomenon that can be modeled 
by particles that all travel at the same speed, and do not interact with each other. 
Not too many physical problems satisfy the latter criterion, but if we assume that 
collisions are very unlikely, then this model can simulate problems from automobile 
traffic to heat flow, and plasma dynamics to light transport. The last example is of 
course our primary interest in computer graphics. 

All we need in order to tailor the transport equation to computer graphics is to 
replace the abstract notion of “particle” with the concrete notion of a “photon,” 
and replace the flux with a more convenient term. 

That term is the radiance, a unit of measurement of light energy. In the next 
chapter we will discuss the field of radiometry, which lays out the definitions for 
various measurements of light. We will then have the vocabulary to return to the 
transport equation and express it in terms that will be directly useful for image 
synthesis. 


12.13 Further Reading 

Transport theory is essentially a heuristic theory based on common sense and some 
simple physical observations. The most important of these are the linearity of surface 
and physical scattering, which are discussed at length by Preisendorfer in his book 
[347]. He offers an excellent development of the subject in physical terms. An 
excellent discussion of transport theory for graphics has been presented by Arvo 
[15]; much of this chapter follows that presentation. Another discussion of the 
meaning of the transport method is presented by Pomraning in his book [342]. 

Our rod model followed the presentation in Wing’s book [482], which then goes 
on to develop the method of invariant embedding . This is an alternative route 
to finding the transport equations that has much to offer in terms of conceptual 
elegance. 

Other discussions of transport theory may be found in the fundamental books by 
Duderstadt and Martin [129], Preisendorfer [347], Tait [431], Wing [482], Williams 
[481], and Case and Zweifel [75]. Computational methods are discussed in detail 
by Lenoble [266]. Scattering in particular is discussed in detail in Williams [481]. 


12.14 Exercises 

Exercise 12.1 

Derive the left-moving particle flux given by Equation 12.5. 

Exercise 12.2 

(Use a symbolic math program for this exercise.) If we assume constants for all the 
material and surface functions in Equations 12.27 and 12.28, then they admit an 
analytic solution. 

(a) Replace all the material functions in Equations 12.27 and 12.28 with 
constants. Find closed-form expressions for $\Phi(x, L)$ and $\Phi(x, R)$ (don’t worry 
about boundary conditions). 

(b) Choose reasonable values for these constants and plot the fluxes as functions 
of x. 

Exercise 12.3 

We chose to define inscatter and outscatter by including all directions into which a 
particle might be scattered. An alternative is to restrict the domains of integration 
to only those directions not in $\Gamma$. Rewrite Equations 12.52 and 12.54 to express this 
other interpretation. Discuss whether this formulation has any advantage over the 
form used in the text. 

Exercise 12.4 

Discuss the physical interpretation of Equations 12.18 and 12.19 for $a = \pi/2\sigma$. 

Exercise 12.5 

Add an absorption probability $\sigma_a$ to Equations 12.18 and 12.19. What is the critical 
value of $a$ as a function of $\sigma_a$? 

Exercise 12.6 

Equation 12.56 contains a balance between the inscattering term and the 
outscattering term $\Phi_o$. When these terms were developed, we noted that they both included 
scattering events for particles inside the range $\Gamma$ both before and after scattering. 
Show explicitly that these particles are not included in Equation 12.56 by writing 
both as the sum of those events where the particle is in $\Gamma$ both before and after, and 
those events where it is not. Show that the common term cancels, and interpret each 
major step of the math in words. 

Exercise 12.7 

Equation 12.61 transforms a divergence expression into a gradient expression. The 
transformation asserts that for a direction $\vec{\omega}$ and vector function $F$, 

$\nabla \cdot (\vec{\omega} F) = \vec{\omega} \cdot \nabla F$    (12.100) 

Prove this equality. 

Exercise 12.8 

There are five Platonic solids: these are polyhedra formed by multiple instances of a 
regular polygon such that every vertex is identical. In these problems, find the solid 
angle subtended by a single face of each polyhedron. 

(a) The tetrahedron: four triangular faces. 

(b) The cube: six square faces. 

(c) The octahedron: eight triangular faces. 

(d) The dodecahedron: twelve pentagonal faces. 

(e) The icosahedron: twenty triangular faces. 




Difficulties illuminate existence. But they must 
be fresh, and of high quality. 

Tom Robbins 

(“Even Cowgirls Get the Blues,” 1976) 



RADIOMETRY 


13.1 Introduction 

The vocabulary of quantified light energy comes from the field of radiometry, which 
gives us the tools we need to tie the abstractions of the transport equation to real, 
physically measurable phenomena, and the definitions of terms most convenient for 
discussing those phenomena. 

Actually, radiometry was preceded by many years by photometry , the study of 
how a human observer responds to light. As we saw in Unit I, the human visual 
system has a nonlinear response to light of different frequencies. When we want to 
discuss light energy in the abstract, we are best off with radiometry, which doesn’t 
drag the human observer into the discussion. We will see that we can always convert 
a radiometric term to a photometric one when desired. 

This chapter is based on the standard reference works by IES [221], Nicodemus 
et al. [318], and Siegel and Howell [406]. Some material is also based on Hanrahan 
[188] and Kajiya [235], which discuss radiometry for computer graphics. 


13.2 Radiometric Conventions 

Radiometric terms describe physical quantities; most radiometric terms can be 
measured in practice with lab instruments. 

Every radiometric term that we will encounter is a function of wavelength, time, 
position, direction, and polarization, and will vary with each of these dimensions. So 
any radiometric value $g$ would be fully written as $g(\lambda, t, \mathbf{r}, \vec{\omega}, \gamma)$, which is too bulky 
to manipulate conveniently. It is traditional in the literature to keep the notation 
simple and to suppress many of these dependencies. In particular, we will make three 
standard assumptions, all of which can be relaxed. 

First, we will suppress any dependence on polarization. 

Second, we will assume (as we did during the development of the transport 
equation) that energy of different wavelengths is decoupled. That is, the energy 
associated with some region of space, or surface, at wavelength $\lambda_1$ is independent of 
the energy at $\lambda_2$. This allows us to set the wavelength parameter to some constant 
in all of our discussions, so it need not be explicitly present in the equations. This 
assumption precludes modeling the important phenomenon of phosphorescence, 
where energy is absorbed at one wavelength and reradiated at another. 

Third, we will assume that there is no time-dependent behavior in the system. 
Essentially we are assuming that light travels infinitely fast, and that we are dealing 
with a stable (or equilibrium) situation. This excludes luminescence , which is the 
phenomenon whereby a material radiates light energy that it absorbed at some 
previous time. 

Although we are excluding phosphorescence and luminescence, it is only for 
notational simplicity during the development of our energy model. These phenomena 
will be easily reincorporated into the model in Chapter 17. 

These simplifications allow us to get our terms down to five scalar variables, 
three for position and two for direction, which we can write conveniently as just two 
vector quantities. So in this chapter we will characterize light in terms of position $\mathbf{r}$ 
and direction $\vec{\omega}$. 

13.3 Notation 

The radiometric literature is filled with conflicting and confusing sets of definitions 
and units for measuring energy. Furthermore, much of that notation has 
shortcomings. A major disadvantage of the standard notation is that none of the measures 
are vectors. In fact, we will find some terms that appear identical, yet have 
different names depending on whether the quantity being measured is flowing toward or 
away from a surface, a distinction that would be handled easily by vector notation. 
Worse, this naming is not even consistent, so that even the most important term 
(radiance) must always be identified (usually by a subscript) as incident, or reflected 
with respect to a surface. Kajiya has pointed out that projected areas and projected 
solid angles are used in radiometry precisely to compensate for the lack of vector dot 
products [235]. 

I was very tempted to write the radiometry in this book with vector notation. But 
that would have only added yet another set of notation incompatible with all the 
others, and I felt that there were already too many competing standards. Instead, 
the terms and units in this book follow the conventions adopted by the American 
National Standards Institute (ANSI) [432] and the Illuminating Engineering Society 
of North America (IES) [221]. The ANSI/IES notation is the closest thing we have to 
a standard, and familiarity with it allows us to read at least some of the radiometric 
literature. A complete summary of radiometric terms and units appears in Table 13.1. 

This lack of explicit directional terms doesn’t remove our need to discuss the 
direction of flowing energy, it just makes it harder. In this book we will use the bold, 
uppercase Greek phi ($\mathbf{\Phi}$) to refer to the direction of flowing energy; this reminds us 
that we are discussing a vector quantity with magnitude given by the flux $\Phi = \|\mathbf{\Phi}\|$. 

My one bow to notational revision is to avoid the use of conventional terms like 
$A_p$ for projected area and $\Omega$ for projected solid angle, since they leave us completely 
uninformed about what direction the term is being projected into. Following the 
terminology of Chapter 12, if any term $u$ is projected into a direction $\mathbf{F}$, we write 
that as $u^{\mathbf{F}}$. We will also continue to distinguish different types of solid angles as in 
the last chapter. To recapitulate, directions, differential solid angles, and finite solid 
angles are represented, respectively, by $\vec{\omega}$, $d\vec{\omega}$, and $\Gamma$. 

All radiometric terms come in two flavors: radiometric and spectral 
radiometric. A spectral radiometric term describes some measure of light at a particular 
wavelength, such as $E(\lambda)$. The regular radiometric terms describe that measure as 
integrated over all wavelengths: $E = \int_0^\infty E(\lambda) \, d\lambda$. We can also think of each term as 
a function of an interval of the spectrum, e.g., $E = \int_{\lambda_1}^{\lambda_2} E(\lambda) \, d\lambda$. For simplicity, we 
will write most of our terms without the specificity of wavelength. Because we are 
assuming linearity of materials with respect to frequency, any radiometric quantity 
in this chapter may be interpreted in any of these ways. 


13.4 Spherical Patches 

We will often project solid angles and spherical patches onto the base of a hemisphere. 
We will use the vector notation $\mathbf{N}$ to refer to the normal at the base of the hemisphere. 
So if a patch $dA$ is projected onto the base, we write that as $dA^{\mathbf{N}}$; a projected solid 
angle is $d\vec{\omega}^{\mathbf{N}}$ or $\Gamma^{\mathbf{N}}$. 

TABLE 13.1 
Radiometric, spectral radiometric, and photometric terms. 

Definition                                Name                             Unit 
$Q_e$                                     Radiant energy                   joule (J) 
$Q_v$                                     Luminous energy                  talbot 
$u_e = dQ_e/dV$                           Radiant flux density             joule/m$^3$ 
$u_v = dQ_v/dV$                           Luminous flux density            talbot/m$^3$ 
$\Phi_e = dQ_e/dt$                        Radiant power (flux)             watt (W) = J/sec 
$\Phi_v = dQ_v/dt$                        Luminous power (flux)            lumen (lm) = talbot/sec 
$\Phi_\lambda = d\Phi_e/d\lambda$         Spectral radiant power           W/m 
$W_e = d\Phi_e/dA$                        Radiant power density            W/m$^2$ 
$W_v = d\Phi_v/dA$                        Luminous power density           lm/m$^2$ 
$W_\lambda = dW_e/d\lambda$               Spectral radiant power density   W/(m$^2$ · m) 
$E_e = d\Phi_e/dA$                        Irradiance                       W/m$^2$ 
$E_v = d\Phi_v/dA$                        Illuminance                      lm/m$^2$ 
$E_\lambda = dE_e/d\lambda$               Spectral irradiance              W/(m$^2$ · m) 
$M_e = d\Phi_e/dA$                        Radiant exitance                 W/m$^2$ 
$M_v = d\Phi_v/dA$                        Luminous exitance                lm/m$^2$ 
$M_\lambda = dM_e/d\lambda$               Spectral radiant exitance        W/(m$^2$ · m) 
$I_e = d\Phi_e/d\omega$                   Radiant intensity                W/sr 
$I_v = d\Phi_v/d\omega$                   Luminous intensity               candela (cd) = lm/sr 
$I_\lambda = dI_e/d\lambda$               Spectral radiant intensity       W/(sr · m) 
$L_e = d^2\Phi_e/(d\omega\,dA^N)$         Radiance                         W/(sr · m$^2$) 
$L_v = d^2\Phi_v/(d\omega\,dA^N)$         Luminance                        lm/(sr · m$^2$) 
$L_\lambda = dL_e/d\lambda$               Spectral radiance                W/(sr · m$^3$) 

We will make extensive use of differential and finite solid angles in this chapter. 
It will be useful to find the solid angle occupied by a small rectangular patch of 
the sphere. Consider Figure 13.1, where patch $dA$ with dimensions $d\theta$ by $d\psi$ is 
illustrated. The vertical side of this patch subtends an angle $d\theta$ on a great 
circle of radius $r$, so its length is $r\,d\theta$. The horizontal side subtends an angle 
$d\psi$, but the radius of that circle is $r\sin\theta$, as shown in the figure. Thus the area of 
this patch is given by the product 


$$dA = (r\sin\theta\,d\psi)\,(r\,d\theta) = r^2\sin\theta\,d\theta\,d\psi \tag{13.1}$$

FIGURE 13.1 

A differential patch $dA$ and its associated solid angle $d\omega$. 


The differential solid angle associated with this patch is then the area of the patch 
divided by the square of the sphere’s radius: 

$$d\omega = \frac{dA}{r^2} = \sin\theta\,d\theta\,d\psi \tag{13.2}$$

We can now find the projected differential area and projected differential solid 
angle of this patch by projecting both quantities onto the plane of the hemisphere 
(which is perpendicular to the $\mathbf{N}$ vector). The projected area $dA^N$ is 

$$dA^N = r^2\sin\theta\cos\theta\,d\theta\,d\psi \tag{13.3}$$

and the projected differential solid angle $d\omega^N$ is 

$$d\omega^N = \sin\theta\cos\theta\,d\theta\,d\psi \tag{13.4}$$
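As a quick numerical check of Equations 13.2 and 13.4, the sketch below sums the differential solid angle and the projected differential solid angle over the whole hemisphere. The function name `hemisphere_integral` and the sample counts are illustrative choices, not part of the text; the expected totals are $2\pi$ steradians for the solid angle and $\pi$ for the projected solid angle.

```python
import math

def hemisphere_integral(integrand, n_theta=400, n_psi=8):
    """Midpoint-rule sum of integrand(theta, psi) * d_omega over the hemisphere."""
    d_theta = (math.pi / 2) / n_theta
    d_psi = (2 * math.pi) / n_psi
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_psi):
            psi = (j + 0.5) * d_psi
            # d_omega = sin(theta) d_theta d_psi (Equation 13.2)
            total += integrand(theta, psi) * math.sin(theta) * d_theta * d_psi
    return total

# Plain solid angle: integrand is 1.
solid_angle = hemisphere_integral(lambda theta, psi: 1.0)
# Projected solid angle: the extra cos(theta) of Equation 13.4.
projected_solid_angle = hemisphere_integral(lambda theta, psi: math.cos(theta))

print(solid_angle, 2 * math.pi)
print(projected_solid_angle, math.pi)
```

The same quadrature can be reused for any quantity defined per unit solid angle by passing a different integrand.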


13.5 Radiometric Terms 

We begin with the basic unit of energy: radiant energy is denoted Q, which is 
measured in joules (J). In terms of our particle model, each particle may be thought 
of as carrying some number of joules of energy. 
Recall from Equation 11.28 the basic energy-frequency relationship 

$$E = h\nu \quad [\mathrm{J}] \tag{13.5}$$

which specifies an energy $E$ for every frequency $\nu$. 

The amount of energy per unit volume may be described by the radiant energy 
density $w$: 

$$w = Q/V \quad \left[\frac{\mathrm{J}}{\mathrm{m}^3}\right] \tag{13.6}$$

This can be thought of as the combined energy of each photon in the volume, divided 
by the size of the volume. 

We now turn our attention to measures of moving energy. As we saw in the last 
chapter, the energy flowing through a surface per unit time is called the flux. In 
radiometry, it is called radiant power or radiant flux $\Phi$ at that surface: 

$$\Phi = dQ/dt \quad \left[\mathrm{W} = \text{watt} = \frac{\mathrm{J}}{\mathrm{s}}\right] \tag{13.7}$$

The unit of radiant flux is the watt (W), which is one joule per second. 

The interaction of radiant energy and matter requires a description of the flow of 
energy toward or away from a surface. A convenient measure of energy flow is to 
find the incident or departing flux per unit of surface area. This measure is known 
as radiant flux area density $u$: 

$$u = d\Phi/dA \quad \left[\frac{\mathrm{W}}{\mathrm{m}^2}\right] \tag{13.8}$$

The area $dA$ used in the definition of flux density need not be part of an actual 
surface; indeed, $dA$ can simply be an imaginary 2D surface in space. If the flux 
is uniform over a finite surface, we may drop the differentials and evaluate 
$u = \Phi/A$. Note that we have used $dA$ and not its projected area. 

Radiant flux density is a useful measure. But because it is a scalar, we don’t know 
whether the flux is arriving at a surface or departing from it. Two terms similar to 
radiant flux density allow us to make this distinction concrete. If energy is arriving 
at a surface, we call it the irradiance $E$: 

$$E = d\Phi/dA \quad \left[\frac{\mathrm{W}}{\mathrm{m}^2}\right] \tag{13.9}$$

The measure of flux leaving a surface is called the radiant exitance $M$ (in computer 
graphics, this quantity is also called the radiosity $B$): 

$$B = M = d\Phi/dA \quad \left[\frac{\mathrm{W}}{\mathrm{m}^2}\right] \tag{13.10}$$

The definitions of radiant flux density, irradiance, and radiant exitance were all in terms 
of a piece of surface area. Our definitions require that the source have some area, 
differential or finite. When we want to represent a point source, the area goes to 
zero and we have a problem. 

An alternative is to define the ratio of flux with respect to solid angle rather 
than area; this then works well for describing radiation arriving at or leaving from 
point sources. We may then define a term corresponding to exitance that finds the 
rate of change of flux from a point as a function of solid angle. This measure of 
radiant energy leaving a point, in the direction $\vec{\omega}$, per unit solid angle, is called the 
intensity $I$: 

$$I = d\Phi/d\omega \quad \left[\frac{\mathrm{W}}{\mathrm{sr}}\right] \tag{13.11}$$


An alternative interpretation of intensity is that it describes the radiant energy leaving 
a point per unit area at unit distance. Note that the word “intensity” is highly 
overloaded in computer graphics, used casually to refer to everything from the 
power of a light source to its perceived brightness. 
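To make the definition of intensity concrete, here is a minimal sketch for an isotropic point source, which is an assumption introduced for illustration. Such a source spreads its total flux evenly over $4\pi$ steradians, and a small patch of area $dA$ at distance $r$ subtends a solid angle of about $dA\cos\theta/r^2$, so the irradiance it receives is $I\cos\theta/r^2$. The function names are hypothetical.

```python
import math

def isotropic_intensity(total_flux_watts):
    """Intensity I = Phi / (4*pi), in W/sr, for a source radiating equally in all directions."""
    return total_flux_watts / (4.0 * math.pi)

def irradiance_from_point_source(intensity, distance, cos_theta=1.0):
    """Irradiance in W/m^2 on a small patch at `distance` from the point source.

    cos_theta is the cosine of the angle between the patch normal and the
    direction back to the source (1.0 means the patch faces the source).
    The patch subtends d_omega = dA * cos_theta / distance**2, so
    E = dPhi / dA = I * cos_theta / distance**2.
    """
    return intensity * cos_theta / distance ** 2

intensity = isotropic_intensity(100.0)           # a 100 W source
print(irradiance_from_point_source(intensity, 1.0))
print(irradiance_from_point_source(intensity, 2.0))  # four times smaller
```

Doubling the distance quarters the irradiance while the intensity of the source itself is unchanged, which is why intensity is the natural quantity for point sources.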

Finally, we can combine the ideas of solid angle and area into what is perhaps the 
most important radiometric term. The power arriving at or leaving from a surface, 
per unit solid angle and per unit projected area, is called the radiance $L$: 

$$L = \frac{d^2\Phi}{dA^N\,d\omega} = \frac{d^2\Phi}{dA\,d\omega^N} \quad \left[\frac{\mathrm{W}}{\mathrm{sr}\cdot\mathrm{m}^2}\right] \tag{13.12}$$


Notice that we need to use either the projected solid angle or the projected area when 
working with radiance; in other words, we have introduced a cosine term. 

The radiance may alternatively be expressed in terms of the intensity, irradiance, 
or the exitance: 

$$L = \frac{dI}{dA^N} = \frac{dE}{d\omega^N} = \frac{dM}{d\omega^N} \quad \left[\frac{\mathrm{W}}{\mathrm{sr}\cdot\mathrm{m}^2}\right] \tag{13.13}$$


When we use the intensity (or irradiance or exitance) to find radiance, we need 
to use a projected value to get the cosine into the expression. We will see the value 
of defining radiance in this way in Section 13.6.1. 

The radiometric definitions presented here are summarized in Table 13.2. 


13.6 Radiometric Relations 

The definitions in the previous section may be linked to provide useful relationships 
between a differential source patch $dS$ radiating energy and a differential receiver 
patch $dR$ upon which some of that energy falls, as shown in Figure 13.2. The patches 
have normals $N_S$ and $N_R$, respectively. 

TABLE 13.2 
Radiometric definitions. 

Radiant term           Symbol    Definition                                               Unit 
Energy                 $Q$       -                                                        J 
Energy density         $w$       $dQ/dV$                                                  J/m$^3$ 
Power (flux)           $\Phi$    $dQ/dt$                                                  W 
Flux area density      $u$       $d\Phi/dA$                                               W/m$^2$ 
Intensity              $I$       $d\Phi/d\omega$                                          W/sr 
Exitance (radiosity)   $M$       $d\Phi/dA$                                               W/m$^2$ 
Irradiance             $E$       $d\Phi/dA$                                               W/m$^2$ 
Radiance               $L$       $d^2\Phi/(dA\,d\omega\cos\theta) = dI/dA^N = dE/d\omega^N$   W/(sr · m$^2$) 

FIGURE 13.2 
Patch geometry. 

The vector $\vec{\omega}(S, R)$ connects a point $S \in dS$ to a point $R \in dR$. Because both $dS$ 
and $dR$ are small, the three values $\|\vec{\omega}(S, R)\|$, $\vec{\omega}(S, R) \cdot N_S$, and $\vec{\omega}(S, R) \cdot N_R$ can 
all be considered constants over all points $R$ and $S$. So we simply write the vector $\vec{\omega}$ 
to represent the transfer from $S$ to $R$. The patch $dS$ presents a projected area $dS^N$ 
and occupies a solid angle $d\omega_S$ from any point $R \in dR$, and similarly for $dR$. 

Consider the term $dA^N\,d\omega$ in the definition of radiance in Equation 13.12. In 
terms of the transfer from $dS$ to $dR$, we would have $dS^N\,d\omega_R$. Let’s expand this 
term, writing $A(dS)$ and $A(dR)$ to refer to the areas of the patches, $r$ to represent 
the distance between them, and $\theta_S$ and $\theta_R$ to refer to the angle made by the normal 
to each patch with the transfer vector $\vec{\omega}$. 

$$\begin{aligned}
dS^N\,d\omega_R &= [A(dS)\cos\theta_S]\,[A(dR)\cos\theta_R / r^2] \\
&= [A(dS)\cos\theta_S / r^2]\,[A(dR)\cos\theta_R] \\
&= dR^N\,d\omega_S
\end{aligned} \tag{13.14}$$

The substitutions for the solid angles here used the approximation that for a small 
patch at radius $r$, the solid angle subtended by the patch is nearly the same as the 
projected area of the patch divided by $r^2$. We call Equation 13.14 the principle of 
reciprocity of transfer volume. It says that the product of the size of a projected 
patch and the solid angle occupied by another patch is the same if the calculation is 
performed with the labels on the patches reversed. 
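The reciprocity of transfer volume in Equation 13.14 can be checked numerically for any pair of small patches. The sketch below is illustrative only: the helper names, the tuple-based vectors, and the sample geometry are assumptions, and the solid angles use the same small-patch approximation as the text (projected area divided by $r^2$).

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def transfer_volumes(pos_s, normal_s, area_s, pos_r, normal_r, area_r):
    """Return (dS^N * d_omega_R, dR^N * d_omega_S) for two small patches."""
    v = tuple(b - a for a, b in zip(pos_s, pos_r))   # vector from source to receiver
    r2 = dot(v, v)                                   # squared distance
    w = normalize(v)
    cos_s = dot(normalize(normal_s), w)                          # angle at the source
    cos_r = dot(normalize(normal_r), tuple(-x for x in w))       # angle at the receiver
    proj_s = area_s * cos_s          # projected source area dS^N
    proj_r = area_r * cos_r          # projected receiver area dR^N
    d_omega_r = proj_r / r2          # receiver as seen from the source
    d_omega_s = proj_s / r2          # source as seen from the receiver
    return proj_s * d_omega_r, proj_r * d_omega_s

forward, backward = transfer_volumes(
    (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), 1e-4,
    (1.0, 0.0, 2.0), (0.0, 0.0, -1.0), 2e-4,
)
print(forward, backward)
```

Both products reduce to $A(dS)\,A(dR)\cos\theta_S\cos\theta_R/r^2$, so the two printed values agree up to rounding.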

We can use this principle immediately. From the definition of radiance in 
Equation 13.12, we can write an expression for the incident radiance $L_R$ falling on $dR$ in 
terms of the incident flux $\Phi_R$: 

$$dL_R = \frac{d\Phi_R}{d\omega_R\,dS^N} \tag{13.15}$$

Using the principle of reciprocity of transfer volume, we find 

$$d\Phi_R = dL_R\,d\omega_R\,dS^N = dL_R\,dR^N\,d\omega_S \tag{13.16}$$

Equations 13.16 are called the flux-radiance relations [235]. 

Combining Equation 13.16 with the definition of irradiance in Equation 13.9, 
we can equate equivalent expressions for $d\Phi$ and solve for the irradiance in terms of 
the radiance: 

$$\begin{aligned}
E_R\,dR^N &= d\Phi_R = dL\,dR^N\,d\omega_S \\
E_R &= d\omega_S\,dL
\end{aligned} \tag{13.17}$$

Equation 13.17 is the irradiance-radiance relation. It tells us that the irradiance 
$E_R$ falling on a receiving patch $dR$ due to a source patch $dS$ is equal to the radiance 
$dL$ of the source times the differential solid angle $d\omega_S$ of the source as seen from the 
receiver. This is an important relationship, because we will find it useful to express 
the outgoing radiance at a point in terms of the incident irradiance. 
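A small worked instance of the irradiance-radiance relation (Equation 13.17) may help. The sketch assumes the receiver faces the source directly, so that its projected and actual areas coincide, and it uses the small-patch approximation for the source's solid angle. The function name and the sample numbers are hypothetical.

```python
def irradiance_from_radiance(radiance, source_area, cos_theta_s, distance):
    """E_R = dL * d_omega_S (Equation 13.17) for a receiver facing the source.

    d_omega_S is approximated as the projected source area divided by the
    squared distance, as in the derivation of Equation 13.14.
    """
    d_omega_s = source_area * cos_theta_s / distance ** 2
    return radiance * d_omega_s

# A 0.01 m^2 source of radiance 100 W/(sr m^2), seen face-on from 2 m away.
print(irradiance_from_radiance(100.0, 0.01, 1.0, 2.0))
```

Moving the receiver farther away shrinks the solid angle of the source, and with it the irradiance, even though the radiance of the source is unchanged.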
13.6.1 Discussion of Radiance 


Radiance is a fundamental measure in image synthesis; although it’s flux that actually 
measures the power moving through the environment, many synthesis methods are 
based on the radiance because it is more convenient. It is important to stop for a 
moment and lock down the interpretation of radiance. 

The definition of radiance is similar in spirit to intensity and irradiance, but it 
contains an extra cosine term the others do not. To recapitulate, the radiance leaving 
a point $r$ on a differential source patch $dS$ into a solid angle $d\omega$ around direction $\vec{\omega}$, 
making an angle $\theta_S$ with the normal of the source patch, is 

$$L(r, \vec{\omega}) = \frac{d^2\Phi(r, \vec{\omega})}{d\omega\,dS\cos\theta_S} = \frac{dI(r, \vec{\omega})}{dS\cos\theta_S} = \frac{dE(r, \vec{\omega})}{d\omega\cos\theta_S} \tag{13.18}$$


To see where the cosine comes from, let’s forget about radiance for a moment 
and imagine a density function of position and direction $P(r, \vec{\omega})$, defined over the 
source patch $dS$. When we integrate this function over some volume of phase space 
(that is, some combination of source points and directions), we will get the number 
of photons that will be generated by the surface into that region of phase space. 
Since we will assume everything is a differential, we can find this total flux by simply 
scaling the function $P(r, \vec{\omega})$ by the volume $dr\,d\omega$ of phase space: 

$$\Phi(r, \vec{\omega}) = P(r, \vec{\omega})\,dr\,d\omega \tag{13.19}$$

Assuming that we know $P$, all that’s left to find the flux is to find an expression for 
the phase-space volume $dr\,d\omega$. Consider the transmission of flux from a differential 
source patch $dS$ to a differential receiver patch $dR$, as shown in Figure 13.3; the 
vector $V$ joins the centers of the two patches. The receiver is perpendicular to $V$, 
though the source is not. The two patches define a tube in space, which contains 
the flux. In the figure we see $d\omega_S$, the solid angle of the source as seen from the 
receiver. The flux flows through this solid angle to every point on the receiver; thus, 
the phase-space volume of the tube is the size of the solid angle times the size of the 
receiver: 

$$dr\,d\omega = dR\,d\omega_S \tag{13.20}$$

The total flux is then 

$$\Phi(r, \vec{\omega}) = P(r, \vec{\omega})\,dR\,d\omega_S \tag{13.21}$$

Now suppose that the receiver is replaced by a different receiving patch, $dR'$. The 
new patch has the same center point as the old one, so the line $V'$ joining it to $dS$ is 
the same as the old $V$. The new patch makes an angle of $\theta_{R'}$ with this center line, 
and it has just the right size to fit in the old tube; that is, its projected area is the 
same as the area of the perpendicular patch: $dR'\cos\theta_{R'} = dR$. In other words, the 
phase-space volume is conserved: 

$$dR\,d\omega_S = dR'\cos\theta_{R'}\,d\omega_S \tag{13.22}$$

so we have the satisfying result that the flux traveling down the tube doesn’t change 
just because we’ve replaced the receiver: 

$$\Phi(r, \vec{\omega}) = P(r, \vec{\omega})\,dR'\cos\theta_{R'}\,d\omega_S \tag{13.23}$$

FIGURE 13.3 
Transfer of energy from a source to a receiver. (a) The solid containing $dS$ and $dR$. (b) The solid 
angle $d\omega_S$ occupied by $dS$ as seen from $dR$. (c) A different receiver $dR'$. 


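In general, then, the density function that we integrate to find the flux traveling 
from $dS$ to $dR$ is 

$$P(r, \vec{\omega}) = \frac{d^2\Phi(r, \vec{\omega})}{(dR\cos\theta_R)\,d\omega_S} = \frac{d^2\Phi(r, \vec{\omega})}{dR^N\,d\omega_S} \tag{13.24}$$

This is just the radiance $L$. To see the definition in its traditional form, expand the 
solid angle and regroup: 

$$\begin{aligned}
L(r, \vec{\omega}) &= \frac{d^2\Phi(r, \vec{\omega})}{dR^N\,d\omega_S} \\
&= \frac{d^2\Phi(r, \vec{\omega})}{(dR\cos\theta_R)\,(dS\cos\theta_S / |V|^2)} \\
&= \frac{d^2\Phi(r, \vec{\omega})}{(dS\cos\theta_S)\,(dR\cos\theta_R / |V|^2)} \\
&= \frac{d^2\Phi(r, \vec{\omega})}{dS^N\,d\omega_R}
\end{aligned} \tag{13.25}$$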


where $|V|$ is the length of the line joining the two patches, and $d\omega_R$ is the solid angle 
of the receiver as seen from the source patch. This second form involving the projected 
area of the source can be a bit confusing, because there doesn’t seem to be any 
reason to project the source patch at all. It’s simply the result of an algebraic shuffle; 
the first form has the geometric interpretation that the radiance is that function of 
the surface point and direction, which when multiplied by a differential volume of 
phase space around that point and direction gives the flux emitted into that volume. 

FIGURE 13.4 
A patch projected onto a point source. 

We can now make an important observation. We first note that the flux contains 
an inverse-square term buried within the solid angle, which can be seen from the 
third line of Equation 13.25: 

$$L(r, \vec{\omega}) = \frac{d^2\Phi(r, \vec{\omega})\,r^2}{dR^N\,dS^N} \tag{13.26}$$


where we have renamed the distance $r = |V|$, and written it in the numerator. The 
presence of this term is no accident. Consider an isotropic point source of light 
emitting a total flux $\Phi_t$. Because the source is isotropic, the same flux is distributed 
equally over the surface of all spheres around the point source. Suppose we have 
a patch at a distance $r_A$ and that its projected area onto the source is $dA$, as in 
Figure 13.4. Then the flux falling on this patch is $\Phi_t\,dA / r_A^2$ (assuming that the 
medium is a vacuum). Note that since the radiance $L(r, \vec{\omega})$ from this source is the 
same in all directions, and there’s only the one point at the source, $L(r, \vec{\omega}) = L$. 
Computing the radiance $L$ from this source, we find 

$$L = \frac{(d^2\Phi_t / r_A^2)\,r_A^2}{dR^N\,dS^N} = \frac{d^2\Phi_t}{dR^N\,dS^N} \tag{13.27}$$


Notice that the distance term r has dropped out! As the patch recedes, the power it 
receives falls off with the distance squared, but the definition of radiance compensates 
exactly. This works fine for any point source, or any differential or finite source far 
enough away that we can consider it a point. 
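The cancellation of the distance term can be seen numerically. The sketch below is an illustration under stated assumptions: the source is an isotropic point source of total flux `total_flux`, the receiver is a small disk facing the source, and the flux it intercepts is computed from the exact on-axis solid angle of a disk, including the $4\pi$ normalization. All function names are hypothetical.

```python
import math

def disk_solid_angle(radius, distance):
    """Exact solid angle of a disk of `radius`, seen on-axis from `distance` away."""
    return 2.0 * math.pi * (1.0 - distance / math.hypot(distance, radius))

def flux_on_disk(total_flux, radius, distance):
    """Flux that an isotropic point source delivers to the disk."""
    return total_flux * disk_solid_angle(radius, distance) / (4.0 * math.pi)

def distance_compensated_flux(total_flux, radius, distance):
    """Received flux times r^2 per unit receiver area (the r^2 factor of Equation 13.26)."""
    area = math.pi * radius ** 2
    return flux_on_disk(total_flux, radius, distance) * distance ** 2 / area

for r in (1.0, 2.0, 4.0, 8.0):
    print(r, flux_on_disk(100.0, 0.01, r), distance_compensated_flux(100.0, 0.01, r))
```

The second column falls off as the inverse square of the distance, while the third column stays essentially constant, which is the behavior the definition of radiance is designed to capture.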

Although it is flux that is actually transmitted from one patch to another, we 
know that we can find the flux from the radiance at the source and the geometry of 
the two patches; therefore, we often sidestep the mechanics and speak of radiance as 
being transmitted from one patch to another. 

The fact that radiance doesn’t vary with distance (in a vacuum) is essential to 
image synthesis; if we didn’t have a radiometric quantity with this property we 
would have had to invent one. Shirley sums up this essential observation in the ray 
law [402], which we have adapted here: 

The ray law: The radiance of a point source onto a patch is 
invariant as the patch is moved radially with respect to the source. 

We will often approximate the flux radiated by a small source patch onto a receiving 
patch by finding the radiance at a particular point on the source and then multiplying 
by the volume of phase space defined by the geometry of the two patches. We can 
find this representative source point by ray-tracing (discussed in Chapter 19). If we 
imagine that the ray is a conduit of information from the source to the receiver, then 
the information flowing along that ray is radiance. From the radiance at the source 
and the geometry, we can find the flux arriving at the receiver. 


13.6.2 Spectral Radiometry 

Recall that each of the radiometric terms introduced in the last section may be 
evaluated at a specific wavelength. It is sometimes useful to explicitly refer to the 
value at some particular wavelength $\lambda$; thus, irradiance $E$ might actually be written 
$E(\lambda)$ or $E_\lambda$. 

When a radiometric term is written at a specific wavelength, it is called a spectral 
radiometric term. Table 13.1 defines each of the spectral radiometric terms. In 
general, the units of these terms are the units of the radiometric term divided by [m], 
the unit of length. 

13.6.3 Photometry 

The human visual system is only responsive to energy in a certain range of the 
electromagnetic spectrum. Even within that range, the human visual response is not 
uniform. The concepts of radiometry may be adjusted to represent the perceptions 
of a statistically standard human observer to objective radiometric quantities; the 
new terms are described as photometric terms, and are distinguished from their 
radiometric counterparts by appending the subscript v. 

Recall from Unit I that the human visual system does not respond equally to all 
radiation. In particular, the average person is most responsive to signals that arrive in 
a wavelength range from about 380 to 780 nm, sometimes called the visual band. This 
definition includes two approximations. The first is the use of an “average observer,” 
which is a statistical average built from measured data gathered from experiments on 
human volunteers. Some people are responsive to wavelengths outside of this band, 
and others are not responsive to everything in the band. Nevertheless, the abstraction 
of an average person is useful in practice. The second approximation is to state that 
the visual system is “not responsive” outside of these limits. In fact, the response 
of the visual system tapers toward the edges of this band, and these wavelengths 
represent somewhat arbitrary cutoff points in the magnitude of the response. 

Nevertheless, by international agreement a curve has been defined called the 
photopic spectral luminous efficiency of the human visual system; it is usually written 
as $V(\lambda)$. This curve is plotted in Figure 13.5, and some values are tabulated in 
Table G.4 in Appendix G. 

By accepting this curve as the response of the human visual system, we can 
find the perceived brightness by an observer in response to an input signal $S(\lambda)$ by 
modulating the signal by the response curve and integrating the resulting perceived 
energy: $\int S(\lambda)\,V(\lambda)\,d\lambda$. 

Modulation of light energy by the spectral response curve forms the basis of a 
branch of radiometry called photometry. Photometric terms are simply radiometric 
terms weighted by $V(\lambda)$ and then scaled to a new set of units. Each of these terms 
replaces the word “radiant” with “luminous” in its name, and uses the subscript 
$v$ in its symbol. When radiometric and photometric quantities both appear in an 
equation, we will label the former with the subscript $e$. 

Luminous energy is denoted $Q_v$ and is measured in talbots. For photopic (color) 
vision, luminous energy is found from the expression 

$$Q_v = K_m \int V(\lambda)\,Q_e(\lambda)\,d\lambda \tag{13.28}$$

By international agreement, the conversion factor $K_m$ is defined as 680 lumen/watt 
(the lumen is defined below as the unit of luminous flux). The density of luminous 
energy in some volume may be measured by the luminous energy density: 

$$w_v = Q_v/V \quad \left[\frac{\text{talbot}}{\mathrm{m}^3}\right] \tag{13.29}$$

FIGURE 13.5 
Spectral efficiency curve. 


The transport of luminous energy over time defines luminous power or luminous 
flux $\Phi_v$: 

$$\Phi_v = dQ_v/dt \quad \left[\frac{\text{talbot}}{\mathrm{s}} = \text{lumen} = \mathrm{lm}\right] \tag{13.30}$$

The unit of luminous flux is the lumen (lm), which is one talbot per second. 

The other photometric terms parallel the radiometric ones and are summarized in 
Table 13.1. We have used those terms recommended by the ANSI and IES. Table 13.3 
lists some of the other units and their conversion factors to those presented above. 
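The weighting by $V(\lambda)$ described above can be sketched as a simple numerical integration. This is a minimal illustration, not a reference implementation: the function name `luminous_flux`, the trapezoid rule, and the sampled inputs are assumptions, and the efficiency curve must be supplied by the caller (for example from Table G.4). The default $K_m$ of 680 lm/W follows the text; the current SI definition uses 683 lm/W.

```python
def luminous_flux(wavelengths_nm, spectral_flux, efficiency, k_m=680.0):
    """Trapezoid-rule estimate of K_m * integral of V(lambda) * Phi_e(lambda) d lambda.

    wavelengths_nm : increasing sample wavelengths, in nanometers
    spectral_flux  : spectral radiant flux samples at those wavelengths, in W/nm
    efficiency     : callable giving V(lambda) in [0, 1] for a wavelength in nm
    k_m            : conversion factor in lm/W (680 follows the text)
    """
    total = 0.0
    for i in range(len(wavelengths_nm) - 1):
        lam0, lam1 = wavelengths_nm[i], wavelengths_nm[i + 1]
        f0 = efficiency(lam0) * spectral_flux[i]
        f1 = efficiency(lam1) * spectral_flux[i + 1]
        total += 0.5 * (f0 + f1) * (lam1 - lam0)
    return k_m * total

# Sanity check with a deliberately unrealistic V = 1 everywhere:
# 0.01 W/nm over a 200 nm band is 2 W, which maps to 2 * 680 lumens.
flat = luminous_flux([400.0, 500.0, 600.0], [0.01, 0.01, 0.01], lambda lam: 1.0)
print(flat)
```

With a real efficiency curve the result drops well below this upper bound, because most wavelengths are weighted by a value of $V(\lambda)$ that is much less than 1.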


13.7 Reflectance 

Reflection is the process by which electromagnetic flux (power), incident on a 
stationary surface or medium, leaves that surface or medium from the incident 
side without change in frequency; reflectance is the fraction of the incident flux 
that is reflected. (Nicodemus et al. [318]) 

TABLE 13.3 
Other terms used in radiometric literature. Source: Data from ANSI [432] and IES [221]. 

Illuminance $E_v$           1 lm/m$^2$ = 1 lux (lx) 
                            = $9.29 \times 10^{-2}$ foot-candles 
                            = $1 \times 10^{-4}$ phot 
                            = $1 \times 10^{-1}$ milliphot 
                            = $1 \times 10^{3}$ nox 

Luminous intensity $I_v$    1 lm/sr = 1 candela (cd) 
                            = 1 candle 
                            = 1.04 carcel 
                            = 1.11 hefner 

Luminance $L_v$             1 cd/m$^2$ = 1 nit 
                            = $1 \times 10^{-4}$ stilb 
                            = $9.29 \times 10^{-3}$ cd/ft$^2$ 
                            = $6.45 \times 10^{-4}$ cd/in$^2$ 
                            = $\pi$ blondel 
                            = $\pi$ apostilb 
                            = $\pi \times 10^{-4}$ lambert 
                            = $\pi \times 10^{-1}$ millilambert 
                            = $2.92 \times 10^{-1}$ foot-lambert 
                            = $2.92 \times 10^{-1}$ equivalent foot-candles (eqv) 
                            = $3.2 \times 10^{4}$ skot 
                            = $2.92 \times 10^{2}$ glim 


This definition of reflectance makes concrete our intuitive concept of reflection as 
the percentage of light “bounced” off a surface. Just as with volumes, we say that a 
surface scatters the incident light by changing its direction; in this case, from toward 
the surface to away from it. 

The goal of this section is to define a function that clearly establishes this 
relationship for different classes of material. The function will return a dimensionless 
number that relates the output flux to the input flux given the geometry of 
measurement and the reflectance properties of the surface. The development in this section 
follows Nicodemus et al. [318]. We will actually derive three functions in this section: 
the BRDF $f_r$, which describes the ratio of reflected radiance to incident irradiance; 
the reflectance $\rho$, which describes the ratio of reflected flux to incident flux; and the 
reflectance factor $R$, which describes the reflectance of a material with respect to a 
perfect diffuse reflector. In Chapter 15 we will use these functions to define shading 
algorithms, which provide the boundary conditions for the light transport equation. 

FIGURE 13.6 
A simple geometry for deriving reflection. 

In this chapter we consider only reflectance. When light passes through a material, 
it is said to be transmitted; this effect is called transmission or transmittance. In 
general, transmission is very similar to reflectance with a couple of minor additional 
wrinkles. It is traditional to study reflectance in some detail, and leave transmission 
out of the discussion entirely until it can be easily included. Here we follow that 
model and discuss only reflection. 


13.7.1 The BRDF $f_r$ 

We begin with the diagram of Figure 13.6, which shows some energy arriving at 
point $P$ on a differential surface $dA$ through an incident differential solid angle $d\vec{\omega}_i$, 
pointing in direction $\vec{\omega}_i$. We would like to find an expression for the differential 
radiance $dL_r$ reflected by the surface from point $Q$ into a direction $\vec{\omega}_r$ due to the 
incident flux $d\Phi_i$ arriving through the differential solid angle $d\vec{\omega}_i$. 

In general, the reflected radiance $dL_r$ will be proportional to the incident flux 

$$dL_r \propto d\Phi_i \tag{13.31}$$

Spelling out all the dependencies, and naming the proportionality constant $S$, we 
find 

$$dL_r(d\vec{\omega}_i, P; \vec{\omega}_r, Q) = S(d\vec{\omega}_i, P; \vec{\omega}_r, Q)\,d\Phi_i(d\vec{\omega}_i, P) \tag{13.32}$$

Note that the illumination arrives through a differential solid angle $d\vec{\omega}_i$, but we are 
measuring the emitted flux only along a direction $\vec{\omega}_r$. Also remember that all of our 
equations may be interpreted for a specific wavelength, or a finite or half-infinite 
interval of wavelengths. 

The constant of proportionality $S$ depends on the two vectors $\vec{\omega}_i$ and $\vec{\omega}_r$, and the 
points $P$ and $Q$ of incidence and reflection. The function $S$ is called the bidirectional 
scattering-surface reflectance-distribution function, or BSSRDF [318]. The adjectives 
refer respectively to the fact that the function depends on both directions, it tells how 
light is scattered (or reflected) at a surface, and it is a distribution function, in the 
sense that it may contain distribution (or generalized) functions such as the Dirac 
delta function $\delta(x)$. 

The BSSRDF is a very high-level description of reflection. However, the BSSRDF 
is a difficult function to measure, store, and compute with, due to its dependence 
on four vector variables. We will now make some simplifying assumptions about 
the surface and its illumination that give us a more computationally convenient 
expression, following Nicodemus [318]. 

First, we assume that the incident radiance $L^i(\vec{\omega}_i, P)$ has the same cross section 
over all points on $A$. That is, for any direction $\vec{\omega} \in d\vec{\omega}_i$, the radiances $L^i(\vec{\omega}, P_1)$ and 
$L^i(\vec{\omega}, P_2)$ are the same for every $P_1$ and $P_2$. Then we can write the incident radiance 
from direction $\vec{\omega}$ simply as $L^i(\vec{\omega})$, and leave out the argument $P$. 

It will be useful to express the incident flux $\Phi_i$ in Equation 13.32 in terms of the 
irradiance $E$. Here we write the irradiance $E(d\vec{\omega}_i)$ to indicate that it is a function of 
the incident solid angle. Then we can write the flux as 

$$\begin{aligned}
d\Phi_i(d\vec{\omega}_i, P) &= dE(d\vec{\omega}_i)\,dA \\
&= L^i(d\vec{\omega}_i)\,d\omega_i^N\,dA
\end{aligned} \tag{13.33}$$


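Since $d\Phi_i$ strikes each differential area $dA \in A$, the total reflected radiance $dL_r$ 
comes from integrating the contribution of $d\Phi_i(d\vec{\omega}_i, P)$ over all points $P$ in $A$: 

$$\begin{aligned}
dL_r(d\vec{\omega}_i; \vec{\omega}_r, Q) &= \int_{P \in A} dL_r(d\vec{\omega}_i, P; \vec{\omega}_r, Q) \\
&= \int_{P \in A} S(d\vec{\omega}_i, P; \vec{\omega}_r, Q)\,d\Phi_i(d\vec{\omega}_i, P) \\
&= \int_{P \in A} S(d\vec{\omega}_i, P; \vec{\omega}_r, Q)\,dE(d\vec{\omega}_i)\,dA \\
&= dE(d\vec{\omega}_i) \int_{P \in A} S(d\vec{\omega}_i, P; \vec{\omega}_r, Q)\,dA
\end{aligned} \tag{13.34}$$

where we have used the fact that since $L^i$ is independent of position, so is $dE$. 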

Now assume that the material is isotropic. That is, $S$ depends only on the angle 
between the incident energy and the normal, so $S$ is rotationally symmetric about 
the normal. Symbolically, $S(\theta, \psi) = S(\theta, \psi')$ for any $\psi'$, as in Figure 13.7(a). 

Furthermore, if the material is uniform, then its properties everywhere on $A$ are 
the same; that is, the position of $P$ doesn’t matter. Therefore the only relationship 
between $P$ and $Q$ that matters is the distance $r$ between them, as in Figure 13.7(b). 
So we can eliminate $P$ and $Q$ from our expressions, leaving only $r$: 

$$\begin{aligned}
dL_r(d\vec{\omega}_i, \vec{\omega}_r) &= dE(d\vec{\omega}_i) \int_A S(d\vec{\omega}_i, \vec{\omega}_r, r)\,dA \\
&= dE(d\vec{\omega}_i)\,f(d\vec{\omega}_i, \vec{\omega}_r, r)
\end{aligned} \tag{13.35}$$

where 

$$f(d\vec{\omega}_i, \vec{\omega}_r, r) = \int_A S(d\vec{\omega}_i, \vec{\omega}_r, r)\,dA \tag{13.36}$$

for $r = |P - Q|$. The most common notation for this function rolls the $r$ argument 
into the name of the function, giving us 

$$f_r(d\vec{\omega}_i, \vec{\omega}_r) = \int_A S(d\vec{\omega}_i, \vec{\omega}_r, r)\,dA \tag{13.37}$$

FIGURE 13.7 
(a) Isotropy. (b) The distance between $P$ and $Q$. 


If we now allow $\vec{\omega}_r$ to become a differential solid angle $d\vec{\omega}_r$, we can write this 
function as $f_r(d\vec{\omega}_i \to d\vec{\omega}_r)$. This expresses the proportion of incident flux reflected 
from $d\vec{\omega}_i$ into $d\vec{\omega}_r$ over all of $A$. Solving Equation 13.35 for $f$ (now $f_r$) gives us 

$$f_r(d\vec{\omega}_i \to d\vec{\omega}_r) = \frac{dL_r(d\vec{\omega}_i, d\vec{\omega}_r, E)}{dE(d\vec{\omega}_i)} \tag{13.38}$$

or, in terms of incident radiance, 

$$f_r(d\vec{\omega}_i \to d\vec{\omega}_r) = \frac{dL_r(d\vec{\omega}_i, d\vec{\omega}_r, E)}{L^i(d\vec{\omega}_i)\,d\omega_i^N} \tag{13.39}$$

FIGURE 13.8 
Geometry for the BRDF. 


The function f r defined by Equation 13.39 is called the bidirectional reflectance 
distribution function (BRDF); its geometry is illustrated in Figure 13.8. The BRDF 
is the fundamental description of how a surface reflects. It characterizes the surface’s 
reflectivity in terms of the incident solid angle, the incident light, and the reflected 
solid angle. Because the incident radiation can come from a solid angle with no size 
(that is, just a ray), the BRDF can take on any value from 0 to infinity. We will 
remedy this problem shortly. 

There are two simple but important properties of the BRDF that should be kept 
in mind when working with these functions. 

The first is reciprocity, which simply states that if we reverse the roles of the 
incident and reflected energy, nothing changes. That is, if we send some amount of 
energy through solid angle $\Gamma_i$ and measure the energy radiated into solid angle $\Gamma_r$, 
then if we send that same amount into solid angle $\Gamma_r$, we will measure the same 
propagated energy coming out of $\Gamma_i$. This rule was first stated by Helmholtz and is 
sometimes known as the Helmholtz reciprocity rule. 

The second important property of the BRDF is that it must be normalized. That is, 
the total energy propagated in response to some irradiation must be no more than the 
energy received, and usually will be less. Although one can build mechanical devices 
that implement arbitrary BRDFs [313], real physical materials never propagate more 
light than they receive. 

In Chapter 12 we saw that volume scattering was linear; that is, multiple 
independent fluxes passing through a volume summed together. Similarly, in almost all 
physical materials, surface scattering is linear. This means that the energy arriving 
from each direction contributes independently to the reflection. In this case, we can 
find the total reflected radiance in $\vec{\omega}_r$ by simply summing together the contribution 
from each direction. We do this by integrating over the incident hemisphere $\Omega_i$. 

Integrating the incident radiance $L^i$ over the incident hemisphere, we find 

$$\begin{aligned}
L_r(\vec{\omega}_r) &= \int_{\Omega_i} f_r(\vec{\omega}_i \to \vec{\omega}_r)\,L^i(\vec{\omega}_i)\,d\omega_i^N \\
&= \int_{\psi_i = 0}^{2\pi} \int_{\theta_i = 0}^{\pi/2} f_r((\theta_i, \psi_i) \to (\theta_r, \psi_r))\,L^i(\theta_i, \psi_i)\cos\theta_i\sin\theta_i\,d\theta_i\,d\psi_i
\end{aligned} \tag{13.40}$$

using the expanded value of $d\omega_i^N$ from Equation 13.4. Equation 13.40 is called the 
reflectance equation [188]. 
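The reflectance equation (Equation 13.40) can be evaluated by straightforward quadrature over the incident hemisphere. The sketch below is illustrative: the midpoint rule, the function names, and the constant (Lambertian) BRDF of value albedo$/\pi$ are assumptions chosen because the answer is easy to check by hand, not something prescribed by the text.

```python
import math

def reflected_radiance(brdf, incident_radiance, theta_r, psi_r, n_theta=200, n_psi=64):
    """Midpoint-rule evaluation of the reflectance equation (Equation 13.40)."""
    d_theta = (math.pi / 2) / n_theta
    d_psi = (2 * math.pi) / n_psi
    total = 0.0
    for i in range(n_theta):
        theta_i = (i + 0.5) * d_theta
        # Projected differential solid angle: cos(theta) sin(theta) d_theta d_psi
        weight = math.cos(theta_i) * math.sin(theta_i) * d_theta * d_psi
        for j in range(n_psi):
            psi_i = (j + 0.5) * d_psi
            total += (brdf(theta_i, psi_i, theta_r, psi_r)
                      * incident_radiance(theta_i, psi_i) * weight)
    return total

def lambertian(albedo):
    """A constant BRDF, albedo / pi, used here only as a simple test case."""
    return lambda theta_i, psi_i, theta_r, psi_r: albedo / math.pi

# Uniform incident radiance of 1: the reflected radiance should equal the albedo.
uniform = reflected_radiance(lambertian(0.5), lambda t, p: 1.0, 0.3, 0.0)
# Incident radiance proportional to cos(theta): expect albedo * 2/3.
cosine_lit = reflected_radiance(lambertian(0.5), lambda t, p: math.cos(t), 0.3, 0.0)
print(uniform, cosine_lit)
```

Because the integrand is evaluated independently for each incident direction, the loop is a direct expression of the linearity assumption: each direction contributes its own term to the sum.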

The reflectance equation satisfies the Helmholtz reciprocity principle, which states 
that we can reverse the direction of flow of energy and nothing will change. That is, 
it doesn’t matter in which direction we calculate the transfer. In symbols, 

$$f_r(d\vec{\omega}_r \to d\vec{\omega}_i) = f_r(d\vec{\omega}_i \to d\vec{\omega}_r) \tag{13.41}$$

Some authors write $f_r(d\vec{\omega}_i \leftrightarrow d\vec{\omega}_r)$ to emphasize this property. 


13.7.2 Reflectance $\rho$ 

The BRDF is a useful characterization of reflection, but it can take on values from 
0 to infinity. A related measure is the ratio of the reflected flux to the incident flux, 
which due to conservation of energy always lies between 0 and 1 (a patch can never 
reflect more flux than it receives). This measure is called the reflectance, and is 
denoted $\rho$. In symbols, we define the reflectance $\rho$ as 

$$\rho(\Gamma_i \to \Gamma_r) = \frac{d\Phi_r(\Gamma_r)}{d\Phi_i(\Gamma_i)} \tag{13.42}$$

for incident and reflected finite solid angles $\Gamma_i$ and $\Gamma_r$. 

For a differential solid angle $d\vec{\omega}_i$, we can find the incident flux from 
Equation 13.33: 

$$d\Phi_i(d\vec{\omega}_i) = L^i(d\vec{\omega}_i)\,d\omega_i^N\,dA \tag{13.43}$$

So for a finite solid angle $\Gamma_i$, we need only integrate over all directions in the angle 
to find the total incident flux: 

$$\begin{aligned}
d\Phi_i(\Gamma_i) &= \int_{d\vec{\omega}_i \in \Gamma_i} L^i(d\vec{\omega}_i)\,dA\,d\omega_i^N \\
&= dA \int_{d\vec{\omega}_i \in \Gamma_i} L^i(d\vec{\omega}_i)\,d\omega_i^N
\end{aligned} \tag{13.44}$$

We find the reflected flux in the same way, getting 

$$d\Phi_r(\Gamma_r) = dA \int_{d\vec{\omega}_r \in \Gamma_r} L_r(d\vec{\omega}_r)\,d\omega_r^N \tag{13.45}$$

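To evaluate Equation 13.45, we need to find an expression for the reflected 
radiance $L_r$. This just requires integrating the differential reflected radiance 
$dL_r(d\vec{\omega}_i; \vec{\omega}_r; E)$ over all $d\vec{\omega}_i \in \Gamma_i$: 

$$\begin{aligned}
L_r(d\vec{\omega}_r) &= \int_{d\vec{\omega}_i \in \Gamma_i} dL_r(d\vec{\omega}_i; \vec{\omega}_r; E) \\
&= \int_{d\vec{\omega}_i \in \Gamma_i} f_r(d\vec{\omega}_i \to \vec{\omega}_r)\,dE(d\vec{\omega}_i) \\
&= \int_{d\vec{\omega}_i \in \Gamma_i} f_r(d\vec{\omega}_i \to \vec{\omega}_r)\,L^i(d\vec{\omega}_i)\,d\omega_i^N
\end{aligned} \tag{13.46}$$

where we have used the definitions of the BRDF $f_r$ and irradiance $E$. Substituting 
Equation 13.46 into Equation 13.45 gives us the reflected flux: 

$$d\Phi_r(\Gamma_r) = dA \int_{d\vec{\omega}_r \in \Gamma_r} \int_{d\vec{\omega}_i \in \Gamma_i} f_r(d\vec{\omega}_i \to \vec{\omega}_r)\,L^i(d\vec{\omega}_i)\,d\omega_i^N\,d\omega_r^N \tag{13.47}$$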


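We can now form the ratio of the fluxes from Equations 13.44 and 13.47: 

$$\frac{d\Phi_r(\Gamma_r)}{d\Phi_i(\Gamma_i)} = \frac{\displaystyle\int_{d\vec{\omega}_r \in \Gamma_r} \int_{d\vec{\omega}_i \in \Gamma_i} f_r(d\vec{\omega}_i \to \vec{\omega}_r)\,L^i(d\vec{\omega}_i)\,d\omega_i^N\,d\omega_r^N}{\displaystyle\int_{d\vec{\omega}_i \in \Gamma_i} L^i(d\vec{\omega}_i)\,d\omega_i^N} \tag{13.48}$$

Notice that the differential areas $dA$ cancel out. 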

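This is still too complicated to work with efficiently for image synthesis. To
simplify Equation 13.48, assume (as before) that $L^i$ is constant, so that it doesn’t
depend on the direction $\vec\omega$ at all; that is, $L^i(\vec\omega) = L^i$. Then $L^i$ factors out of the
numerator and denominator of Equation 13.48, and we get the reflectance formula:

$$ \rho(\Gamma_i \to \Gamma_r) = \frac{d\Phi_r(\Gamma_r)}{d\Phi_i(\Gamma_i)} =
\frac{\displaystyle \int_{d\vec\omega_r \in \Gamma_r} \int_{d\vec\omega_i \in \Gamma_i} f_r(d\vec\omega_i \to \vec\omega_r)\, d\omega_i^N\, d\omega_r^N}
     {\displaystyle \int_{d\vec\omega_i \in \Gamma_i} d\omega_i^N}
\quad [\text{dimensionless}] \tag{13.49} $$

This equation was our goal in this section. Equation 13.49 is sometimes written

$$ \rho(\Gamma_i \to \Gamma_r) = \frac{1}{\Psi_i} \int_{d\vec\omega_r \in \Gamma_r} \int_{d\vec\omega_i \in \Gamma_i} f_r(d\vec\omega_i \to \vec\omega_r)\, d\omega_i^N\, d\omega_r^N \tag{13.50} $$

where

$$ \Psi_i = \int_{d\vec\omega_i \in \Gamma_i} d\omega_i^N = \int_{d\vec\omega_i \in \Gamma_i} \cos\theta_i\, d\omega_i \tag{13.51} $$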


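We note that when the input angle is a hemisphere (that is, $\Gamma_i = \Omega_i$), then the
integration of $\Psi_i$ becomes simple:

$$ \begin{aligned}
\Psi_i &= \int_{\Omega_i} d\omega_i^N \\
       &= \int_{\Omega_i} \cos\theta\, d\omega \\
       &= \int_{\psi=0}^{2\pi} \int_{\theta=0}^{\pi/2} \cos\theta \sin\theta\, d\theta\, d\psi \\
       &= \pi
\end{aligned} \tag{13.52} $$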
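The value in Equation 13.52 is easy to confirm numerically. The following is a minimal sketch, assuming a simple midpoint rule; the function name and the sample count are illustrative choices, not part of the text.

```python
import math


def projected_solid_angle_hemisphere(n_theta=2000):
    """Midpoint-rule estimate of the hemisphere integral of cos(theta) sin(theta)."""
    d_theta = (math.pi / 2.0) / n_theta
    inner = 0.0
    for k in range(n_theta):
        theta = (k + 0.5) * d_theta
        inner += math.cos(theta) * math.sin(theta) * d_theta
    # The integrand does not depend on psi, so the psi integral is a factor of 2*pi.
    return 2.0 * math.pi * inner


if __name__ == "__main__":
    print(projected_solid_angle_hemisphere(), math.pi)
```

The estimate should agree with π to several decimal places, which is the normalizing constant that appears in the hemispherical entries of the reflectance tables.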


Types of Reflectance

The reflectance ρ depends on only three things: the BRDF $f_r$ of the surface, and the
incident and reflected solid angles $\Gamma_i$ and $\Gamma_r$. Nicodemus et al. [318] have suggested
distinguishing three types of solid angles, named directional, conical, and hemispherical.
Hanrahan [188] has suggested the more descriptive names differential, finite,
and hemispherical. We associate the symbols $d\omega$, $\Gamma$, and $\Omega$ with these classes.

Each of the incident and reflected solid angles may take on any of these three 
values, resulting in a total of nine types of reflectance. The names of the six mixed 
types are formed by combining the adjective for the incident solid angle type with 
the reflected solid angle type; the three homogeneous types are preceded with “bi.” 
The symbols and names for all nine reflection types are shown in Table 13.4. 

There are three things to note in this table. First, one integral drops out for
each differential angle involved, as one would expect. Second, the terms involving
a differential reflected solid angle are differential factors themselves. Third, the
biconical (or bifinite) reflectance subsumes all the others, if we allow the finite angle
to become as small as a differential or as large as a hemisphere.
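Incident $d\omega$ (differential):
    Bidifferential:             $d\rho(d\omega \to d\omega) = f_r()\, d\omega_r^N$
    Differential-finite:        $\rho(d\omega \to \Gamma) = \int_{\Gamma_r} f_r()\, d\omega_r^N$
    Differential-hemispherical: $\rho(d\omega \to \Omega_o) = \int_{\Omega_o} f_r()\, d\omega_r^N$

Incident $\Gamma$ (finite):
    Finite-differential:        $d\rho(\Gamma \to d\omega) = \frac{d\omega_r^N}{\Psi_i} \int_{\Gamma_i} f_r()\, d\omega_i^N$
    Bifinite:                   $\rho(\Gamma \to \Gamma) = \frac{1}{\Psi_i} \int_{\Gamma_i} \int_{\Gamma_r} f_r()\, d\omega_r^N\, d\omega_i^N$
    Finite-hemispherical:       $\rho(\Gamma \to \Omega_o) = \frac{1}{\Psi_i} \int_{\Gamma_i} \int_{\Omega_o} f_r()\, d\omega_r^N\, d\omega_i^N$

Incident $\Omega_i$ (hemispherical):
    Hemispherical-differential: $d\rho(\Omega_i \to d\omega) = \frac{d\omega_r^N}{\pi} \int_{\Omega_i} f_r()\, d\omega_i^N$
    Hemispherical-finite:       $\rho(\Omega_i \to \Gamma) = \frac{1}{\pi} \int_{\Omega_i} \int_{\Gamma_r} f_r()\, d\omega_r^N\, d\omega_i^N$
    Bihemispherical:            $\rho(\Omega_i \to \Omega_o) = \frac{1}{\pi} \int_{\Omega_i} \int_{\Omega_o} f_r()\, d\omega_r^N\, d\omega_i^N$

TABLE 13.4

The nine types of reflection functions.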


13.7.3 Reflectance Factor R

The reflectance factor (denoted R) is the ratio of the reflected flux from a surface 
to the flux that would have been reflected by a perfectly diffuse surface in the same 
circumstances. 

We can form an expression for R by simply finding the ratio of the reflected flux 
to the flux that would have been reflected by a perfectly diffuse surface; that is, one 
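with a constant BRDF $f_{r,pd} = 1/\pi$ (see Section 13.8.1). Using Equation 13.47 to
give us both fluxes, we can easily find their ratio:

$$ R(\Gamma_i \to \Gamma_r) = \frac{d\Phi_r}{d\Phi_{r,pd}} =
\frac{\displaystyle dA \int_{\Gamma_r} \int_{\Gamma_i} f_r(d\vec\omega_i \to \vec\omega_r)\, L^i(d\vec\omega_i)\, d\omega_i^N\, d\omega_r^N}
     {\displaystyle (dA/\pi) \int_{\Gamma_r} \int_{\Gamma_i} L^i(d\vec\omega_i)\, d\omega_i^N\, d\omega_r^N} \tag{13.53} $$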
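Incident $d\omega$ (differential):
    Bidifferential:             $R(d\omega \to d\omega) = \pi f_r()$
    Differential-finite:        $R(d\omega \to \Gamma) = \frac{\pi}{\Psi_r} \int_{\Gamma_r} f_r()\, d\omega_r^N$
    Differential-hemispherical: $R(d\omega \to \Omega_o) = \int_{\Omega_o} f_r()\, d\omega_r^N$

Incident $\Gamma$ (finite):
    Finite-differential:        $R(\Gamma \to d\omega) = \frac{\pi}{\Psi_i} \int_{\Gamma_i} f_r()\, d\omega_i^N$
    Bifinite:                   $R(\Gamma \to \Gamma) = \frac{\pi}{\Psi_i \Psi_r} \int_{\Gamma_i} \int_{\Gamma_r} f_r()\, d\omega_r^N\, d\omega_i^N$
    Finite-hemispherical:       $R(\Gamma \to \Omega_o) = \frac{1}{\Psi_i} \int_{\Gamma_i} \int_{\Omega_o} f_r()\, d\omega_r^N\, d\omega_i^N$

Incident $\Omega_i$ (hemispherical):
    Hemispherical-differential: $R(\Omega_i \to d\omega) = \int_{\Omega_i} f_r()\, d\omega_i^N$
    Hemispherical-finite:       $R(\Omega_i \to \Gamma) = \frac{1}{\Psi_r} \int_{\Omega_i} \int_{\Gamma_r} f_r()\, d\omega_r^N\, d\omega_i^N$
    Bihemispherical:            $R(\Omega_i \to \Omega_o) = \frac{1}{\pi} \int_{\Omega_i} \int_{\Omega_o} f_r()\, d\omega_r^N\, d\omega_i^N$

TABLE 13.5

The nine types of reflection factors.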


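Again assuming that the incident radiance $L^i$ is isotropic, we can pull it out of both
integrals and find

$$ R(\Gamma_i \to \Gamma_r) =
\frac{\pi}{\left(\int_{\Gamma_i} d\omega_i^N\right) \left(\int_{\Gamma_r} d\omega_r^N\right)}
\int_{\Gamma_i} \int_{\Gamma_r} f_r(d\vec\omega_i \to \vec\omega_r)\, d\omega_r^N\, d\omega_i^N
\quad [\text{dimensionless}] \tag{13.54} $$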


There are nine types of reflectance factors, named in the same way as the nine 
reflectances. Their definitions work out to be slightly different in the normalizing 
coefficients; the results are summarized in Table 13.5. 
13.8 Examples

In this section we will examine the BRDF for two special cases: perfect diffuse, and 
perfect specular reflection. 


13.8.1 Perfect Diffuse

For perfect diffuse reflection, we know that incident light is reflected equally in all 
directions. The cosine term in the common form of Lambert’s law is taken care of 
automatically in the definition of the reflection equation; it accounts for how much 
the surface is turned away from the incident light. So the BRDF is simply a constant, 
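often denoted $f_{r,d}$, and takes no arguments:

$$ f_r = f_{r,d} \tag{13.55} $$

If we now plug this into the reflectance equation from Equation 13.40, we find

$$ \begin{aligned}
L_{r,d}(\theta_r, \psi_r) &= \int_{\Omega_i} f_r(\theta_i, \psi_i \to \theta_r, \psi_r)\, L^i(\theta_i, \psi_i)\, d\omega_i^N \\
                          &= f_{r,d} \int_{\Omega_i} L^i(\theta_i, \psi_i)\, d\omega_i^N \\
                          &= f_{r,d}\, E
\end{aligned} \tag{13.56} $$

so that

$$ f_{r,d} = L_{r,d} / E \tag{13.57} $$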

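Perfect diffuse reflection says that the energy arriving from any direction is completely
reflected back into the incident hemisphere with uniform intensity. We can
express this in symbols by asserting that the differential-hemispherical reflectance
must equal 1:

$$ \begin{aligned}
1 = \rho(d\vec\omega_i \to \Omega_r) &= \frac{d\Phi_r}{d\Phi_i} \\
  &= \frac{L_{r,d}\, dA \int_{\Omega_r} d\omega_r^N}{E\, dA} \\
  &= \frac{f_{r,d}\, E\, dA \int_{\Omega_r} d\omega_r^N}{E\, dA} \\
  &= f_{r,d}\, \pi
\end{aligned} \tag{13.58} $$

So solving for the BRDF, we find

$$ f_{r,d} = \frac{1}{\pi} \tag{13.59} $$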
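As a sanity check on Equation 13.59, the sketch below integrates a BRDF against the projected solid angle over the reflected hemisphere. It assumes a plain midpoint rule, and the function names are hypothetical helpers chosen for illustration; a constant BRDF of 1/π should give a reflectance close to 1.

```python
import math


def directional_hemispherical_reflectance(brdf, theta_i, psi_i, n_theta=200, n_psi=200):
    """Integrate brdf * cos(theta_r) over the reflected hemisphere (midpoint rule)."""
    d_theta = (math.pi / 2.0) / n_theta
    d_psi = (2.0 * math.pi) / n_psi
    total = 0.0
    for a in range(n_theta):
        theta_r = (a + 0.5) * d_theta
        # Projected solid angle of one cell: cos(theta) sin(theta) d_theta d_psi.
        weight = math.cos(theta_r) * math.sin(theta_r) * d_theta * d_psi
        for b in range(n_psi):
            psi_r = (b + 0.5) * d_psi
            total += brdf(theta_i, psi_i, theta_r, psi_r) * weight
    return total


def lambertian_brdf(theta_i, psi_i, theta_r, psi_r):
    """Perfect diffuse BRDF: the constant 1/pi, independent of all four angles."""
    return 1.0 / math.pi


if __name__ == "__main__":
    print(directional_hemispherical_reflectance(lambertian_brdf, 0.4, 1.0))
```

Scaling the constant by a factor between 0 and 1 scales the result by the same factor, which is the usual way a diffuse surface that absorbs some energy is modeled.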
FIGURE 13.9

(a) A hemilune of the sphere. (b) A spherical sector of the sphere.


13.8.2 Perfect Specular

To set up the BRDF, it is helpful to break up the analysis into two sections, geometric 
and radiometric, as presented in Hanrahan [188]. 

For perfect specular reflection leaving the surface in the direction $(\theta_r, \psi_r)$, the
geometry of the situation tells us that the incident light must have come from $\theta_i = \theta_r$,
and it must be in the same plane as the reflected light and the normal; that is,
$\psi_i = \psi_r \pm \pi$. The radiometric observation is that the radiance that leaves the
surface in $(\theta_r, \psi_r)$ is the same as the radiance arriving in the incident direction, so
$L^r(\theta_r, \psi_r) = L^i(\theta_r, \psi_r \pm \pi)$. Our goal is to find the BRDF $f_r$ that gives this behavior.
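The geometric half of this argument can be written down directly. The following sketch assumes angles in radians with the azimuth kept in the range [0, 2π); the function name is an illustrative choice. Wrapping the azimuth selects whichever of the two candidates (reflected azimuth plus or minus π) lies in range.

```python
import math


def mirror_incident_direction(theta_r, psi_r):
    """Return the incident (theta_i, psi_i) that mirrors into (theta_r, psi_r)."""
    theta_i = theta_r                              # equal polar angles
    psi_i = (psi_r + math.pi) % (2.0 * math.pi)    # opposite azimuth, wrapped to [0, 2*pi)
    return theta_i, psi_i


if __name__ == "__main__":
    print(mirror_incident_direction(0.3, 0.5))
    print(mirror_incident_direction(0.3, 4.0))
```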

We begin with the double-integral form of the reflectance equation from Equation 13.40:

$$ L^r(\theta_r, \psi_r) = \int_{\psi=0}^{2\pi} \int_{\theta=0}^{\pi/2} f_r((\theta, \psi) \to (\theta_r, \psi_r))\, L^i(\theta, \psi) \cos\theta \sin\theta\, d\theta\, d\psi \tag{13.60} $$

Our approach is motivated by separating the integrals in this equation. We can
think of it as two nested integrations: the inner one for θ scans hemilunes of width $d\theta$
on the sphere, as in Figure 13.9(a), and the outer integral for ψ scans spherical sectors
of height $d\psi$, as in Figure 13.9(b). If we select only the hemilune corresponding to
$\theta_r$ and the spherical sector corresponding to $\psi_r \pm \pi$, their intersection isolates the
direction we seek. So we will first look at the integral for θ, giving us a BRDF $f_{r,\theta}$,
and then similarly look for a BRDF $f_{r,\psi}$. Their product will be the BRDF $f_r$.

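We begin by fixing $\psi = \psi_0$, so we need only look at θ. The condition $\theta_i = \theta_r$
implies

$$ L^r(\theta_r) = L^i(\theta_r) = \int_{\theta_i=0}^{\pi/2} L^i(\theta_i)\, f_{r,\theta}(\theta_r \to \theta_i) \cos\theta_i \sin\theta_i\, d\theta_i \tag{13.61} $$

We can make the change of variables

$$ u = \cos\theta_i, \qquad du = -\sin\theta_i\, d\theta_i \tag{13.62} $$

Equation 13.61 with this substitution gives

$$ L^r(\theta_r) = -\int_{u=1}^{0} f_{r,\theta}(u_r \to u)\, L^i(\cos^{-1} u)\, u\, du \tag{13.63} $$

Equation 13.63 has a form similar to the one that defines the Dirac delta function.
Recall the third definition of the delta function from Equation 4.23:

$$ \int_a^b \delta(t - c)\, g(t)\, dt = g(c), \qquad c \in [a, b] \tag{13.64} $$

If we guess that the BRDF $f_{r,\theta}$ is a delta function $f_{r,\theta} = \delta(u - u_r)$, then we can write
$g(u) = L^i(\cos^{-1} u)\, u$ in this definition, giving us

$$ L^r(\theta_r) = -\int_{u=1}^{0} \delta(u - u_r)\, g(u)\, du = \int_{u=0}^{1} \delta(u - u_r)\, g(u)\, du \tag{13.65} $$

where reversing the limits of integration absorbs the leading minus sign.
This equation sifts out only $g(u_r)$ from the integral, leaving us with

$$ \begin{aligned}
L^r(\theta_r) &= g(u_r) \\
              &= L^i(\cos^{-1} u_r)\, u_r \\
              &= L^i(\cos^{-1} \cos\theta_r) \cos\theta_r \\
              &= L^i(\theta_r) \cos\theta_r
\end{aligned} \tag{13.66} $$

So the function we guessed for $f_{r,\theta}$ is close, but off by a factor $1/\cos\theta_r$, which is
easily incorporated. So the θ factor of the BRDF for perfect specular reflection is

$$ f_{r,\theta}(\theta_i \to \theta_r) = \frac{\delta(\cos\theta_i - \cos\theta_r)}{\cos\theta_r} \tag{13.67} $$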

Now we fix $\theta = \theta_0$, and look at ψ. The integral for ψ has a much simpler form:

$$ L^r(\psi_r) = L^i(\psi_r \pm \pi) = \int_{\psi=0}^{2\pi} f_{r,\psi}\, L^i(\psi)\, d\psi \tag{13.68} $$

We can capture just the values at $\psi_r + \pi$ and $\psi_r - \pi$ by choosing $f_{r,\psi}$ to be
the sum of two delta functions, one at each location (note that only one will have a
nonzero value for any value of $\psi_r \in [0, 2\pi)$):

$$ f_{r,\psi} = \delta(\psi_r - \psi_i - \pi) + \delta(\psi_r - \psi_i + \pi) \tag{13.69} $$

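Combining the two BRDFs, we find the composite BRDF $f_{r,s}$ for perfect specular
reflection:

$$ \begin{aligned}
f_{r,s}((\theta_i, \psi_i) \to (\theta_r, \psi_r)) &= f_{r,\theta} \cdot f_{r,\psi} \\
  &= \frac{\delta(\cos\theta_i - \cos\theta_r)}{\cos\theta_r} \cdot \left[ \delta(\psi_r - \psi_i - \pi) + \delta(\psi_r - \psi_i + \pi) \right]
\end{aligned} \tag{13.70} $$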


13.9 Spherical Harmonics 

Many of the quantities discussed in this chapter may be parameterized as functions
of direction around a particular point. We can think of them as 2D functions defined
on a sphere, parameterized by the angles θ and φ. But unlike 2D functions in the
plane, there are built-in periodic boundary conditions due to the shape of the sphere.

There exists an infinite family of orthogonal functions called spherical harmonics.
They play the same role as the Fourier transform’s complex exponentials: any function
on the sphere satisfying some fairly broad conditions may be represented by an infinite sum of scaled
spherical harmonics; truncating the infinite expansion gives us a finite approximation
to the function.

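The real and normalized forms of the spherical harmonics are given by a function
$Y_{l,m}$ with two indices, the order $l$ and the degree $m$ [410]:

$$ Y_{l,m}(\theta, \phi) = \begin{cases}
N_{l,m}\, P_{l,m}(\cos\theta) \cos(m\phi)   & \text{if } m > 0 \\
N_{l,0}\, P_{l,0}(\cos\theta) / \sqrt{2}    & \text{if } m = 0 \\
N_{l,m}\, P_{l,-m}(\cos\theta) \sin(-m\phi) & \text{if } m < 0
\end{cases} \tag{13.71} $$

where the normalizing constants $N_{l,m}$ are given by

$$ N_{l,m} = \sqrt{\frac{2l + 1}{2\pi}\, \frac{(l - |m|)!}{(l + |m|)!}} \tag{13.72} $$

and the $P_{l,m}(t)$ are associated Legendre polynomials, defined by the set of recurrence
relations

$$ \begin{aligned}
P_{m,m}(t)   &= (1 - 2m) \sqrt{1 - t^2}\, P_{m-1,m-1}(t) \\
P_{m+1,m}(t) &= t (2m + 1)\, P_{m,m}(t) \\
P_{l,m}(t)   &= t \left( \frac{2l - 1}{l - m} \right) P_{l-1,m}(t) - \left( \frac{l + m - 1}{l - m} \right) P_{l-2,m}(t) \\
P_{0,0}(t)   &= 1
\end{aligned} \tag{13.73} $$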


The first few spherical harmonics are shown in Figure 13.10. 
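The recurrences translate almost line for line into code. The sketch below is one possible arrangement, not a reference implementation: it climbs the diagonal to reach the (m, m) term, steps once to (m+1, m), and then applies the three-term recurrence. The function names are assumptions made for illustration, and the normalization follows the square-root form of Equation 13.72.

```python
import math


def legendre_p(l, m, t):
    """Associated Legendre value P_{l,m}(t) for 0 <= m <= l, using the recurrences."""
    if not 0 <= m <= l:
        raise ValueError("require 0 <= m <= l")
    root = math.sqrt(max(0.0, 1.0 - t * t))
    p_mm = 1.0                                   # P_{0,0} = 1
    for k in range(1, m + 1):                    # climb the diagonal to P_{m,m}
        p_mm *= (1 - 2 * k) * root
    if l == m:
        return p_mm
    p_prev, p_curr = p_mm, t * (2 * m + 1) * p_mm  # P_{m,m} and P_{m+1,m}
    for ll in range(m + 2, l + 1):               # three-term recurrence in l
        p_next = (t * (2 * ll - 1) * p_curr - (ll + m - 1) * p_prev) / (ll - m)
        p_prev, p_curr = p_curr, p_next
    return p_curr


def real_spherical_harmonic(l, m, theta, phi):
    """Real, normalized spherical harmonic Y_{l,m}(theta, phi)."""
    am = abs(m)
    norm = math.sqrt((2 * l + 1) / (2.0 * math.pi)
                     * math.factorial(l - am) / math.factorial(l + am))
    p = legendre_p(l, am, math.cos(theta))
    if m > 0:
        return norm * p * math.cos(m * phi)
    if m < 0:
        return norm * p * math.sin(-m * phi)
    return norm * p / math.sqrt(2.0)


if __name__ == "__main__":
    print(legendre_p(2, 0, 0.3), real_spherical_harmonic(0, 0, 0.7, 1.1))
```

For the lowest harmonic the result is the constant 1/√(4π), which is a convenient check on the normalization.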

Because the spherical harmonics as defined above are orthogonal, they are their
own duals. Thus for a function $f(\theta, \phi)$, the coefficient $b_{l,m}$ on harmonic $Y_{l,m}$ may
be found from the projection of $f$ onto $Y_{l,m}$

$$ b_{l,m} = \int_{\phi=0}^{2\pi} \int_{\theta=0}^{\pi} f(\theta, \phi)\, Y_{l,m}(\theta, \phi) \sin(\theta)\, d\theta\, d\phi \tag{13.74} $$

or, in braket form,

$$ b_{l,m} = \langle f \,|\, Y_{l,m} \rangle_{S^2} \tag{13.75} $$

where we have defined the spherical braket $\langle a \,|\, b \rangle_{S^2}$ representing integration over a
sphere.

One useful property of these functions is that when $l + m$ is even, the corresponding
spherical harmonic is symmetrical about the equator $\theta = \pi/2$. So if our function $f$ is
antisymmetrical (that is, $f(\theta, \phi) = -f(\pi - \theta, \phi)$), then the integral against the
corresponding spherical harmonic has the same magnitude but opposite sign above
and below the equator, sending the coefficient $b_{l,m}$ to zero.

It is sometimes convenient to refer to the spherical harmonics using a single index
$k$. The correspondence between the indices is set up naturally, starting at $l = 0$, $m = 0$
and then ascending in order through increasing values of $m$ for each increasing value
of $l$. That is,

$$ k = m + l(l + 1) \tag{13.76} $$

and

$$ l = \lfloor \sqrt{k} \rfloor, \qquad m = k - (l^2 + l) \tag{13.77} $$
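The two index conversions are short enough to check exhaustively. This is a minimal sketch with illustrative function names; it assumes the order is recovered as the integer part of the square root of k, which follows from k running from l² to l² + 2l within each order.

```python
import math


def sh_index(l, m):
    """Single index k for order l and degree m, with -l <= m <= l."""
    return m + l * (l + 1)


def sh_order_degree(k):
    """Inverse mapping: recover (l, m) from the single index k >= 0."""
    l = math.isqrt(k)          # integer part of the square root
    m = k - (l * l + l)
    return l, m


if __name__ == "__main__":
    for k in range(9):
        print(k, sh_order_degree(k))
```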
FIGURE 13.10

The first few real spherical harmonics.


13.10 Further Reading

The standard references for radiometric terms and units in the United States are
those put forth by the American National Standards Institute (ANSI) [432] and the
Illuminating Engineering Society of North America (IES) [221]. I have followed their
conventions and notation here.

A thorough discussion of reflection is given in the report by Nicodemus et al.
[318], who also include a number of alternative naming systems for reflection functions.

Radiometry for computer graphics is discussed by Kajiya [235] and by Shirley 
[402]. A great deal of practical information for real light sources and principles for 
lighting design may be found in the early textbook by Moon [312]. 


13.11 Exercises

Exercise 13.1

(a) Use Equation 13.1 to form a double integral expressing the surface area of
the sphere, and evaluate the integral.

(b) Use Equation 13.3 to form a double integral expressing the total projected
area of the sphere onto the Z plane. Use domains $\theta \in (0, \pi)$ and $\psi \in (0, 2\pi)$.
Briefly explain your result.

(c) Repeat (b) with domains $\theta \in (0, \pi/2)$ and $\psi \in (0, 2\pi)$.

Exercise 13.2

The BRDF for perfect specular reflection in Equation 13.70 is in a form given by
Hanrahan in [188]. Nicodemus et al. in [318] give the following, different form:

$$ f_r(\theta_i \to \theta_r) = 2\, \delta(\sin^2\theta_i - \sin^2\theta_r) \left[ \delta(\psi_r - \psi_i + \pi) + \delta(\psi_r - \psi_i - \pi) \right] \tag{13.78} $$

Derive this form from the definition of the BRDF in Equation 13.39. (Hint: use the
substitution $v = \sin^2\theta$.)

Exercise 13.3

A man walked up to the complaints department in a hardware store and stated that 
he bought two bulbs in that store the previous day. Each bulb had its radiant power 
in watts printed right on the top of the glass. He bought a 200-watt bulb for reading 
and a 400-watt bulb for his workshop. He found that the 200-watt bulb was too 
bright even for the workshop and, surprisingly, the 400-watt bulb was so dim he 
could hardly see by its light in a dark room. Assuming that the bulbs were correctly 
labeled and working properly, suggest a reason for these observations. 
Exercise 13.4

Consider a conical beam starting at a point P with circular cross section and apex angle
of θ. Choose two cross sections at distances $d_1$ and $d_2$ from P, with radii $r_1$ and $r_2$,
respectively. Assume the flux in the beam is Φ.

(a) Compute the difference in radiance $L_2 - L_1$ between these cross sections.

(b) When does $L_2 = L_1$?

Exercise 13.5

Show that the intensity I may be interpreted as the flux per unit area at a distance 1
from a point source.

Exercise 13.6

Find an expression for the flux Φ from a differential source $dS$ to a large, rectangular
patch $R$.

Exercise 13.7

Find an expression for the flux Φ from a large finite circular source $S$ to a differential
patch $dR$.




There is an exhilaration about setting up with 
new materials. It promotes energy, promises a 
fresh start, and inspires new ideas. 

Bet Borgeson

(“The Colored Pencil,” 1983) 
14.1 Introduction

An important issue in image synthesis is why objects appear the way they do. We 
only see objects because they emit photons. Each emission may be characterized 
as either spontaneous or responsive . A spontaneous emission is generated by the 
material itself in response to its own internal processes. A responsive emission is 
triggered by an external event, such as the arrival of a beam of particles, physical 
friction, or agitation. 

Both types of emissions are due to the action of subatomic particles within the 
material. In order to discuss how materials absorb and emit energy, we will first 
briefly survey some aspects of the nature of atoms and molecules. In Chapter 15, we 
will consider the large-scale effect of many atomic and molecular interactions with 
light when we discuss shading. But to understand how such large-scale phenomena 
arise, it helps to have a basic working knowledge of the underlying physics. 

We will develop some of this structural information in detail, particularly by
deriving the statistical distribution of electrons and photons in materials. These
developments are self-contained and not required for understanding anything else
in the book, and may be safely skipped on a first reading. Each of these sections
(14.3.1 and 14.6.1) is identified in the first paragraph or two.

FIGURE 14.1

(a) The classical atom model. (b) A more modern view.


14.2 Atomic Structure

The classical model of the atom posits a dense nucleus of small particles called protons 
and neutrons, which are orbited by a number of electrons. Modern quantum- 
mechanical views of the atom replace these notions of particles and locations with 
probability functions. This is based on the idea that any given particle cannot be 
located with precision, but that we can assign a probability of finding the particle in 
a given place at a given time. When the probability function is large, the particle is 
likely to be in that place and time. The notion of little billiard balls orbiting around 
a central clump of balls is then replaced by a cloud of electrons, where the cloud 
density corresponds to the value of the probability function. These two pictures
are contrasted in Figure 14.1. 

The center of an atom is called the nucleus . A nucleus contains two types of 
particles (called nucleons): the proton and the neutron. The proton is a massive 
charged particle that is arbitrarily assigned positive charge +e. The neutron is 
almost identical to the proton except it has no charge. Protons and neutrons exert 
a number of attractive and repulsive forces upon each other, which cause them to 
aggregate and form a nucleus. 

The total number of protons in an atom is called the atom’s atomic number, and
is designated Z. A nucleus by itself has an electric charge of +Ze. Light nuclei typically
contain roughly equal numbers of neutrons and protons; atoms of the same element
that differ in their number of neutrons are called isotopes.

An electron is a particle that is three orders of magnitude lighter than the nucleons, 
and has an electric charge — e equal in magnitude but opposite in sign to that of the 
proton. The complete assembly of nucleus and associated electrons is called an atom. 

In an electrically neutral atom, Z electrons surround the nucleus, balancing out 
the nucleus’s excess positive charge. When the charge is not 0, the atom is called an 
ion. When there are too few electrons, the atom has a positive charge and is called 
a cation. When there are too many electrons around a nucleus, there is an excessive 
negative charge; such an atom is called an anion. 

When an electron is within an atom, it is said to be bound. An electron can also 
exist on its own, independent of any particular atom; this is called a free electron. 

Nucleons are relatively stable particles and don’t typically leave the nucleus or
change their internal state under normal conditions. But electrons are very susceptible
to external influences, and the way electrons behave is primarily responsible for
most of the chemical events that make life possible. Electrons are also responsible
for the emission of light by solids.

Electrons are typically in either a stable or unstable state. A stable state is one in 
which the electron can exist for a relatively long period of time without change. An 
unstable state typically has a much shorter lifetime. As an analogy to these states, 
think of a pencil with a sharp point. If you lay the pencil down on its side on a 
flat table, it’s in a stable state: until something interferes with the system, the pencil 
won’t go anywhere. Now imagine balancing the pencil on its point. You may be 
able to get it to hold there for a moment, but normal physical pressures (e.g., wind 
in the room and vibration of the table) will eventually cause the pencil to fall over. 
The condition of being balanced on its point is an unstable state for the pencil: it 
can hold it for a while, but not very long compared to the stable states. 

The state of an electron at any time is given by a set of four quantum numbers. 
These characterize the electron’s energy, momentum, and “spin” (like the other 
terms, spin is an abstract concept, but it is sometimes convenient to think of a small 
ball spinning either clockwise or counterclockwise around an axis). One of the great 
achievements of quantum mechanics was to explain how the periodic table of the 
elements is constructed in terms of the atomic number. Basically, electrons join the 
system in a very well-defined way based on their quantum numbers, which we will 
now summarize [267]. 

The principal quantum number, $n$, describes the energy of a bound electron.
The value of $n$ is drawn from the range of positive integers: $n \in \{1, 2, 3, \ldots\}$. The
angular-momentum quantum number, $l$, describes the possible angular momenta of
the electron about the nucleus. As the energy of the electron goes up (that is, as $n$
increases), there are more possible angular-momentum states. We capture this fact
by drawing the value of $l$ from the range $l \in \{0, 1, 2, \ldots, n - 1\}$. The magnetic-moment
quantum number, $m_\mu$, describes the projection of the angular momentum
of an electron given by $n$ and $l$ onto an axis defined by an external magnetic field. Its
values are $m_\mu \in \{-l, -l + 1, \ldots, 0, \ldots, l - 1, l\}$. Finally, the spin-moment quantum
number, $s_\mu$, describes the projection of the spin of the electron onto this axis; $s_\mu$ only takes on the
values $s_\mu \in \{-1/2, 1/2\}$.

FIGURE 14.2

The first few rows of the periodic table of the elements.

We can summarize these rules with a general quantum state rule for electrons:

$$ 0 \le |m_\mu| \le l < n, \qquad n = 1, 2, 3, \ldots \tag{14.1} $$

In spectroscopy, the values of $n = 1, 2, 3, \ldots$ are denoted K, L, M, N, and the
values for $l = 0, 1, 2, 3, 4, 5$ are denoted s, p, d, f, g, h and are called orbitals, or
shells. The first four letters stand for sharp, principal, diffuse, and fundamental, with
following letters simply coming alphabetically [295]. These historical names come
from the spectral lines observed for atomic sodium. The total angular momentum
of all electrons in an atom is assigned the total angular quantum number, L, which
takes on values 0, 1, 2, 3, 4, 5 (written S, P, D, F, G, H).

The top few levels of the periodic table of the elements are shown in Figure 14.2 
(Table E in Appendix E is the full table). In this table, the horizontal rows are called 
periods and the columns are called groups . We can use the rule of Equation 14.1 
to build the first few elements. In general, we add electrons to a system in a way 
very similar to counting in binary: the spin quantum number is the least-significant 
bit, and the principal quantum number is the most-significant bit. We start with 
every quantum number at its lowest admissible value and work our way up. We 
continue to add electrons until we have Z of them, balancing the charge in the 
nucleus. Table 14.1 summarizes the results we will find, and Figure 14.3 shows how 
the same quantum numbers describe the first few elements. 

One might ask why we should bother changing the quantum numbers at all when
accumulating electrons; that is, why not simply use lots of 1s electrons? Experiment
has shown that electrons in the same system (in our case, the same atom) never share
the same quantum numbers. This is known as the Pauli exclusion principle, and it tells us
that if two electrons coexist in the same atom, they cannot share the identical set of
quantum numbers (actually, this principle holds for all fermions, which is a class of
particles that includes electrons [427]).

FIGURE 14.3

The quantum-mechanical distribution of the first few elements.

We begin by assigning the smallest value to the principal quantum number: $n = 1$.
We can only assign $l = 0$ and $m_\mu = 0$. We now have our choice of spin. We will write
$s_\mu = \pm 1/2$ to indicate that this quantum number can have either value (but only
one value at a time, of course). This set of four quantum numbers, $\{n, l, m_\mu, s_\mu\} =
\{1, 0, 0, \pm 1/2\}$, completely defines one electron (except for the ambiguity in the spin).
This describes the electron associated with hydrogen, atomic number $Z = 1$, and the
first element on the periodic table. It is common to write the electron configuration
by stating the principal quantum number, the letter for the orbital, and a superscript
identifying the number of electrons in the orbital. Thus the electron configuration
for hydrogen, $n = 1$, $l = 0 = s$, is written 1s.

The next element is helium, atomic number $Z = 2$. The second electron is
built by fixing a value of $s_\mu$ for the first electron, and giving the other value to the
second. Thus the two electrons in helium have quantum numbers $\{1, 0, 0, 1/2\}$ and
$\{1, 0, 0, -1/2\}$, together written 1s^2.

 Z   Symbol  Name        Relative configuration  Full electron configuration
 1   H       Hydrogen    1s                      1s
 2   He      Helium      1s^2                    1s^2
 3   Li      Lithium     [He]2s                  1s^2,2s
 4   Be      Beryllium   [He]2s^2                1s^2,2s^2
 5   B       Boron       [He]2s^2,2p             1s^2,2s^2,2p
 6   C       Carbon      [He]2s^2,2p^2           1s^2,2s^2,2p^2
 7   N       Nitrogen    [He]2s^2,2p^3           1s^2,2s^2,2p^3
 8   O       Oxygen      [He]2s^2,2p^4           1s^2,2s^2,2p^4
 9   F       Fluorine    [He]2s^2,2p^5           1s^2,2s^2,2p^5
 10  Ne      Neon        [He]2s^2,2p^6           1s^2,2s^2,2p^6
 11  Na      Sodium      [Ne]3s                  1s^2,2s^2,2p^6,3s
 12  Mg      Magnesium   [Ne]3s^2                1s^2,2s^2,2p^6,3s^2
 13  Al      Aluminum    [Ne]3s^2,3p             1s^2,2s^2,2p^6,3s^2,3p
 14  Si      Silicon     [Ne]3s^2,3p^2           1s^2,2s^2,2p^6,3s^2,3p^2
 15  P       Phosphorus  [Ne]3s^2,3p^3           1s^2,2s^2,2p^6,3s^2,3p^3
 16  S       Sulphur     [Ne]3s^2,3p^4           1s^2,2s^2,2p^6,3s^2,3p^4
 17  Cl      Chlorine    [Ne]3s^2,3p^5           1s^2,2s^2,2p^6,3s^2,3p^5
 18  Ar      Argon       [Ne]3s^2,3p^6           1s^2,2s^2,2p^6,3s^2,3p^6
 19  K       Potassium   [Ar]4s                  1s^2,2s^2,2p^6,3s^2,3p^6,4s
 20  Ca      Calcium     [Ar]4s^2                1s^2,2s^2,2p^6,3s^2,3p^6,4s^2
 21  Sc      Scandium    [Ar]4s^2,3d             1s^2,2s^2,2p^6,3s^2,3p^6,4s^2,3d

TABLE 14.1

Building up elements by quantum rules.

We have now exhausted the possibilities for $n = 1$, so we set $n = 2$ and start again
with $l = 0$; this again forces $m_\mu = 0$. So we add one electron with arbitrary spin
on top of the already full 1s shell to make lithium (Li), with configuration written
1s^2,2s. As before, we can now add another electron, so we have one of each spin,
to make beryllium (Be), atomic number $Z = 4$, 1s^2,2s^2.

To continue, we notice that $l = 1$ is now a permissible state. In general, Equation 14.1
tells us there will be $2(2l + 1)$ electrons associated with any given choice of
$n$ and $l$; the value of $2(2l + 1)$ is called the degeneracy of the state.
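The counting argument can be checked by brute-force enumeration. The sketch below lists every combination of the four quantum numbers permitted by the ranges given in the text for a chosen principal quantum number; the function names are illustrative, and the spin is stored as an exact fraction so that the tuples compare cleanly.

```python
from fractions import Fraction


def electron_states(n):
    """All (n, l, m, s) tuples allowed for principal quantum number n >= 1."""
    states = []
    for l in range(n):                      # l in {0, ..., n-1}
        for m in range(-l, l + 1):          # m in {-l, ..., l}
            for s in (Fraction(-1, 2), Fraction(1, 2)):
                states.append((n, l, m, s))
    return states


def degeneracy(l):
    """Number of electrons that share a given n and l."""
    return 2 * (2 * l + 1)


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, len(electron_states(n)))
```

For n = 2 the enumeration yields eight states, two with l = 0 and six with l = 1, which matches the eight elements from lithium through neon.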

We start with $m_\mu = -1$ and arbitrarily set $s_\mu = -1/2$ to make boron (B), in state
$\{2, 1, -1, -1/2\}$. Here we see our first subtlety: electrons filling up this orbital go
in with parallel spins in order to minimize their mutual repulsion. So from boron
(1s^2,2s^2,2p), the next electron goes in with a new magnetic moment $m_\mu = 0$ and the
same spin $s_\mu = -1/2$, to make carbon (C), in state $\{2, 1, 0, -1/2\}$. We write this as
1s^2,2s^2,2p^2. Similarly, the third electron goes in with magnetic moment $m_\mu = 1$ and
spin $s_\mu = -1/2$, to make nitrogen (N), in state $\{2, 1, 1, -1/2\}$. Now we return to the
other spins, sequentially adding in electrons in states $\{2, 1, -1, 1/2\}$, $\{2, 1, 0, 1/2\}$,
$\{2, 1, 1, 1/2\}$, making sequentially oxygen (O), fluorine (F), and neon (Ne), finally
ending up with configuration 1s^2,2s^2,2p^6.

Due to internal energy effects, the orbitals don’t fill in exactly sequential order as
we might assume from the above. The order is 1s, 2s, 2p, 3s, 3p, 4s, 3d, 4p, 5s, 4d, 5p,
6s, 4f, 5d, 6p, 7s, 5f, 6d. Note, for example, the transition from calcium (Z = 20) to
scandium (Z = 21) in Table 14.1. When an atom’s outermost shell is full, the atom
is particularly stable; examples include helium and neon, which form compounds
with less readiness than other elements such as carbon and nitrogen.
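The filling order above is enough to reproduce the full configurations listed in Table 14.1. The following is a simplified sketch: it fills orbitals strictly in the stated order, so it assumes no exceptions to that order (real elements such as chromium and copper deviate from it). The names `FILL_ORDER`, `CAPACITY`, and `electron_configuration` are illustrative.

```python
FILL_ORDER = ["1s", "2s", "2p", "3s", "3p", "4s", "3d", "4p", "5s",
              "4d", "5p", "6s", "4f", "5d", "6p", "7s", "5f", "6d"]

# Capacity 2(2l+1) of each orbital type.
CAPACITY = {"s": 2, "p": 6, "d": 10, "f": 14}


def electron_configuration(z):
    """Configuration string for atomic number z, filling strictly in FILL_ORDER."""
    remaining = z
    parts = []
    for orbital in FILL_ORDER:
        if remaining == 0:
            break
        count = min(remaining, CAPACITY[orbital[-1]])
        # A single electron is written without a superscript, as in Table 14.1.
        parts.append(orbital if count == 1 else f"{orbital}^{count}")
        remaining -= count
    if remaining:
        raise ValueError("atomic number exceeds the listed orbitals")
    return ",".join(parts)


if __name__ == "__main__":
    for z in (1, 11, 20, 21):
        print(z, electron_configuration(z))
```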

When there is no magnetic field applied to an atom, the different states distinguished
by the quantum number $m_\mu$ degenerate into a single state. The spacing
between these different levels, when they occur, is typically very small. When a
single state splits into three (i.e., when $l = 1$), the resulting set of states is called
a triplet. The occurrence of multiplets of any order when an atom is placed in an
external magnetic field is known as the Zeeman effect.

The structure of these orbitals influences how atoms link up into molecules. Some 
orbital diagrams are shown in Figure 14.4. In these figures, the density of the dots at 
any position indicates the likelihood that the electron will be found at that position; 
higher density corresponds to higher probability. 

All of the s orbitals are spherically symmetric; the others are not so simple [295]. 
Figure 14.5 shows the three p orbitals, p x , p y , and p z . Actually, these pictures are 
just the angular parts of the orbital definitions, but they suggest the asymmetry of 
the orbital’s structure. 

The quantum numbers associated with an electron also determine its energy. In 
general, as the principal quantum number n goes up, so does the energy associated 
with the electron. Figure 14.6 shows some of the energy values associated with the 
sodium atom. The labels on each line are the wavelengths in angstroms (Å) of the
transitions. The vertical lines represent allowable electronic transitions by which an 
electron can gain or lose energy to change states. 

In addition to the Z ground states normally inhabited by the electrons of an electrically
neutral atom, there are many higher-order excited-state energy levels. Above
these levels lies the ionization continuum, where electrons become disassociated from
particular atoms and are free to move away.

FIGURE 14.4

Probability density plots for some electron orbitals in the hydrogen atom. Redrawn from McQuarrie,
Quantum Chemistry, fig. 6-12, p. 232.

FIGURE 14.5

The p orbitals for l = 1. Redrawn from McQuarrie, Quantum Chemistry, fig. 6-11, p. 232.

FIGURE 14.6

Some allowed energy levels in the sodium atom. Redrawn from McQuarrie, Quantum Chemistry,
fig. 8-5, p. 325.

The essential point behind Figure 14.6, and the reason this information is useful
to us in the first place, is that electrons move between states by absorbing and
releasing energy, often in the form of photons. Consider the transition between the
3p and 4d states. From the diagram, we see that this corresponds to 5688.2 Å. Since
the energy E of a photon of wavelength λ is given by $E = hc/\lambda$, this corresponds to
about $3.49 \times 10^{-19}$ J (roughly 2.18 eV). If a photon of this energy arrives at a sodium atom, where there
is an electron in the 3p state and an opening in the 4d state and other conditions are
right, then the atom will absorb the photon.
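The photon energy for this transition follows from $E = hc/\lambda$ once the wavelength is converted to meters. The sketch below uses the exact SI values of the Planck constant, the speed of light, and the electron volt; the function and constant names are illustrative.

```python
PLANCK_H = 6.62607015e-34      # Planck constant, J s (exact SI value)
LIGHT_C = 299792458.0          # speed of light in vacuum, m/s (exact SI value)
JOULES_PER_EV = 1.602176634e-19


def photon_energy_joules(wavelength_angstrom):
    """Energy E = h c / lambda of a photon, with the wavelength given in angstroms."""
    wavelength_m = wavelength_angstrom * 1e-10
    return PLANCK_H * LIGHT_C / wavelength_m


if __name__ == "__main__":
    energy = photon_energy_joules(5688.2)
    print(energy, energy / JOULES_PER_EV)
```

For 5688.2 Å this gives an energy of about 3.49e-19 J, or a little under 2.2 eV, which is typical of visible light.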

In general, absorption means that the photon disappears completely (recall that 
a photon cannot exist at rest, and cannot transfer only some of its energy), and the 
atom is now in an excited state. The incoming photon is called an excitant. 

The excited atom may be stable or unstable, depending on its internal structure. 
Most unstable states have a lifetime no longer than $10^{-8}$ seconds. By that time, the
electron will undergo a radiative transition by dropping back to the ground state 
and emitting a new photon to carry away the difference in energy between the two 
states. 

Absorption and radiation have been found to obey certain transition rules that 
specify which electron transitions may happen. These are based on allowable changes 
in the quantum numbers of the electron and the system. For example, one rule states
that the total angular-momentum quantum number $L$ may change by only -1, 0,
or +1, but in addition the system may never have $L = 0$ both before and after the
transition [295].

There are many ways for an electron to absorb energy and reradiate it into the 
surrounding system. Suppose an electron in a ground state is excited by a photon, 
and then before it drops back down it is excited again by another photon, raising 
the electron to yet a higher energy level. If the electron finally drops down to the 
ground state in one step, it will emit a photon with more energy than either absorbed 
photon. Even when only one photon is absorbed, typically the radiated photon has 
less energy. The energy donated to a system by an excitant, and then left in the 
system after the emission of a photon, is called the energy deficit . When an atom 
responds to an excitant photon by emitting a photon of just the same energy, this 
is called resonance radiation. The ratio of photons emitted to photons absorbed,
multiplied by 100, is called the quantum efficiency of the atom.
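The two bookkeeping quantities just defined are easy to state in code. The sketch below assumes a single absorbed photon followed by a single emitted photon for the energy deficit; the function names and the sample wavelengths and photon counts are hypothetical, chosen only for illustration.

```python
PLANCK_H = 6.62607015e-34   # Planck's constant, J*s
LIGHT_C = 2.99792458e8      # speed of light in vacuum, m/s


def photon_energy(wavelength_m):
    """Energy in joules of a photon of the given wavelength in meters."""
    return PLANCK_H * LIGHT_C / wavelength_m


def energy_deficit(absorbed_wavelength_m, emitted_wavelength_m):
    """Energy left in the system after one absorption and one emission."""
    return photon_energy(absorbed_wavelength_m) - photon_energy(emitted_wavelength_m)


def quantum_efficiency(photons_emitted, photons_absorbed):
    """Photons emitted per photon absorbed, expressed as a percentage."""
    return 100.0 * photons_emitted / photons_absorbed


# Hypothetical numbers: absorb at 450 nm, emit at 600 nm, 80 emitted per 100 absorbed.
deficit = energy_deficit(450e-9, 600e-9)
efficiency = quantum_efficiency(80, 100)
```

In this picture, resonance radiation is the special case where the two wavelengths match and the deficit is zero.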


14.3 Particle Statistics

Large numbers of particles tend to distribute themselves among allowable
configurations in predictable ways. This distribution may be characterized by different
statistical measures that tell us how many particles we can expect for different
ranges of quantum numbers.

The general idea is to consider N particles in s different states and ask for the 
most likely distribution of those particles among those states. For example, consider 
a solid that has been heated. The additional heat turns into kinetic energy among 
the electrons in the solid, and some of these electrons are in excited states. We would 
like to know how the electrons distribute themselves among these states. A similar
question may be asked of the photons emitted by a material that has been exposed
to heat or radiation.

The distribution of particles within a material is important to image synthesis 
because it sets up the phenomena that govern the way the material interacts with 
light. The appearance (or looks) of a material is dependent on many factors (coating, 
smoothness, and so on), but the physical properties of the material itself are always 
relevant. Understanding the distribution of energy states in a material gives us a 
handle on how the material will interact with light. Similarly, understanding the
distribution of generated photons inside a material tells us something about the light
that will be radiated by that material. Although we usually don’t implement simulations
of energy transfer at the subatomic level, it’s useful to have a general understanding 
of the phenomena that we model in the aggregate with shading techniques, such as 
those in Chapter 15. 

When there are a large number of particles in a system, the theoretical prediction 
for their distribution among the possible energy states is extremely close to what is 
actually measured [148]. To find this distribution, we will calculate W , the number 
of ways that the particles may be allocated for a particular distribution, and then 
find the distribution for which W is a maximum. This is the most likely distribution 
of particles. The distribution of electrons in particular is governed by Fermi-Dirac 
statistics. 


14.3.1 Fermi-Dirac Statistics

Electrons are members of a class of subatomic particles known as fermions , which are 
characterized as being noninteracting and indistinguishable [465]. Their distribution 
is given by Fermi-Dirac statistics. We derive those statistics in this section. The 
information here may be skipped on a first reading of the book, since it is not 
essential to later material. 

To develop the Fermi-Dirac statistics, we will follow the presentation in Longini
[276]. We begin with Figure 14.7, which shows a pair of electron transitions between
energies $E_1 \to E_2$ and $E_4 \to E_3$, where

$$E_1 - E_2 = E_3 - E_4 \tag{14.2}$$

These two events can only occur if energy levels $E_1$ and $E_4$ are occupied, and
according to the Pauli exclusion principle, levels $E_2$ and $E_3$ must be empty. Write
$p_i$ for the probability that energy level $E_i$ is occupied; this is called the occupancy
probability. Then the probability $P$ that the events in Figure 14.7 can occur is
given by the product of the probability that $E_1$ and $E_4$ are occupied ($p_1$ and $p_4$,
respectively), and $E_2$ and $E_3$ are empty ($1 - p_2$ and $1 - p_3$, respectively). Combined,
these form the probability

$$P = p_1 p_4 (1 - p_2)(1 - p_3) F \tag{14.3}$$



FIGURE 14.7

A pair of electronic transitions. 


where we have included a quantum-mechanical electronic interaction factor , F. 

We can state another condition on the system by using the principle of detailed 
balancing . This quantum-mechanical principle includes some of the notion that in 
quantum mechanics there is no preferred direction for time; reactions can occur in 
either direction. The principle of detailed balancing says that in thermal equilibrium, 
every physical process proceeds on the average at the same rate as its own inverse 
[276]. This means that the reaction has the same probability of running in the 
opposite direction, with the appropriate probabilities exchanged: 

$$P = p_2 p_3 (1 - p_1)(1 - p_4) F \tag{14.4}$$

where we have used the same factor F . 

Since both Equations 14.3 and 14.4 express $P$, we can combine them to find

$$p_1 p_4 (1 - p_2)(1 - p_3) = p_2 p_3 (1 - p_1)(1 - p_4) \tag{14.5}$$

Dividing through by $p_2 p_4 (1 - p_1)(1 - p_3)$, we find

$$\frac{p_1}{1 - p_1} \cdot \frac{1 - p_2}{p_2} = \frac{p_3}{1 - p_3} \cdot \frac{1 - p_4}{p_4} \tag{14.6}$$


Now from our assumption in Equation 14.2, the left-hand side of Equation 14.6
depends only on the difference of energies $E_2 - E_1$; similarly, the right-hand side
depends only on the difference $E_3 - E_4$. To make this product mimic the difference
it models, we recast each term as a logarithm, writing $f_i$ for the occupancy
probability $p_i$:

$$G_1 = \ln \frac{f_1}{1 - f_1} \qquad G_2 = \ln \frac{f_2}{1 - f_2} \tag{14.7}$$

FIGURE 14.8

Energy states $dE_1$ and $dE_2$.


Consider now two new states, $dE_1 = E_1 + \Delta E$ and $dE_2 = E_2 + \Delta E$, as shown
in Figure 14.8. Since $G_1 - G_2$ depends only on $E_1 - E_2$,

$$(G_1 + \Delta G) - (G_2 + \Delta G) = (E_1 + \Delta E) - (E_2 + \Delta E) \tag{14.8}$$

Writing $dG$ for $\Delta G$ and $dE$ for $\Delta E$, we set

$$\frac{dG_1}{dE_1} = -L_1 \qquad \frac{dG_2}{dE_2} = -L_2 \tag{14.9}$$

So then

$$0 = dG_1 - dG_2 = -L_1\,dE_1 + L_2\,dE_2 = (L_2 - L_1)\,dE_1 \tag{14.10}$$

Since $dE_1 \neq 0$, then $L_2 - L_1 = 0$, or $L_2 = L_1$. But since the energies $E$ were
arbitrary, the values of $L$ must be independent of $E$.

Suppose we pick a reference energy $E_F$ where $G_F = 0$. This value is called the
Fermi level of energy; it’s the energy where the occupancy probability is 1/2. Then
from Equation 14.9 we can write

$$\int_{G_F}^{G_1} dG = -L \int_{E_F}^{E_1} dE \tag{14.11}$$

Integrating, we find

$$G_1 = L(E_F - E_1) \tag{14.12}$$

Equating this with the definition of $G_1$ from Equation 14.7, we find

$$\ln \frac{f_1}{1 - f_1} = L(E_F - E_1) \tag{14.13}$$

which we can solve for $f_1$, finding

$$f_1 = \frac{1}{1 + \exp[L(E_1 - E_F)]} \tag{14.14}$$

Equation 14.14 is called the occupancy equation for energy level $E_1$. Note that
$f_1(E_F) = 1/2$.
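The algebra that leads from Equation 14.13 to Equation 14.14 is short. Written out, with the shorthand $x = L(E_F - E_1)$ introduced here only for compactness (it is not notation from the text):

```latex
\begin{align*}
\ln\frac{f_1}{1-f_1} &= x, \qquad x = L(E_F - E_1) \\
\frac{f_1}{1-f_1} &= e^{x} \\
f_1 &= (1 - f_1)\,e^{x} \\
f_1\,(1 + e^{x}) &= e^{x} \\
f_1 &= \frac{e^{x}}{1+e^{x}} = \frac{1}{1+e^{-x}}
     = \frac{1}{1+\exp[L(E_1 - E_F)]}
\end{align*}
```

Setting $E_1 = E_F$ makes $x = 0$, which gives the value 1/2 noted above.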

Now that we know the likelihood of an electron to be at energy level $E_1$, we
can find the total energy $U$ in the system by summing the products of the number of
electrons $f_i$ at each level $i$ with the energy $E_i$ at that level:

$$U = \sum_i E_i f_i \tag{14.15}$$


Comparing this result to experiments, we find that $L$ corresponds to $1/kT$, where $k$
is Boltzmann’s constant (given in Table E.3) and $T$ is the absolute temperature. Then
we can write the occupancy equation as

$$f_1 = \frac{1}{1 + \exp[(E_1 - E_F)/kT]} \tag{14.16}$$

which is known as the Fermi-Dirac distribution. This distribution is plotted in
Figure 14.9.
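As a rough illustration of Equation 14.16, the sketch below evaluates the occupancy probability for a level at a given energy. The function name, the overflow guard, and the choice to return 1/2 exactly at the Fermi energy when the temperature is zero are assumptions made here, not part of the derivation.

```python
import math

BOLTZMANN_K = 1.380649e-23      # Boltzmann's constant, J/K
ELECTRON_VOLT = 1.602176634e-19  # joules per electron volt


def fermi_dirac(energy, fermi_energy, temperature):
    """Occupancy probability of a level at `energy` (joules), per Equation 14.16."""
    if temperature == 0.0:
        # Absolute-zero limit: a step at the Fermi energy.
        if energy < fermi_energy:
            return 1.0
        if energy > fermi_energy:
            return 0.0
        return 0.5  # convention assumed here for the point E == E_F
    x = (energy - fermi_energy) / (BOLTZMANN_K * temperature)
    if x > 700.0:
        # exp(x) would overflow a double; the occupancy is effectively zero.
        return 0.0
    return 1.0 / (1.0 + math.exp(x))


# Hypothetical Fermi energy of 5 eV at room temperature.
e_f = 5.0 * ELECTRON_VOLT
at_fermi = fermi_dirac(e_f, e_f, 300.0)
above = fermi_dirac(e_f + 0.1 * ELECTRON_VOLT, e_f, 300.0)
below = fermi_dirac(e_f - 0.1 * ELECTRON_VOLT, e_f, 300.0)
```

Levels equally far above and below the Fermi energy have occupancies that sum to one, which is the symmetry visible in plots of this distribution.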

Note that at absolute zero (that is, $kT = 0$),

$$f = \begin{cases} 0 & E > E_F \\ 1 & E < E_F \end{cases} \tag{14.17}$$


This tells us that at absolute zero, all of the electron orbitals below the Fermi energy 
are filled, and all of the orbitals above it are empty. As energy is introduced into the 
system, electrons move out of the lower orbitals into the more excited, higher-energy 
cells. 
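One way to build confidence in the result is to check numerically that the Fermi-Dirac occupancy of Equation 14.16 satisfies the detailed-balance condition of Equation 14.5 whenever the energies obey Equation 14.2. The sketch below does this for arbitrary, hypothetical energies measured in units of $kT$; the function names are introduced here for illustration.

```python
import math


def occupancy(energy, fermi_energy, kT):
    """Fermi-Dirac occupancy probability (Equation 14.16)."""
    return 1.0 / (1.0 + math.exp((energy - fermi_energy) / kT))


def detailed_balance_sides(e1, e2, e3, e4, fermi_energy, kT):
    """Return the left and right sides of Equation 14.5 for four energy levels."""
    p1, p2, p3, p4 = (occupancy(e, fermi_energy, kT) for e in (e1, e2, e3, e4))
    forward = p1 * p4 * (1.0 - p2) * (1.0 - p3)
    reverse = p2 * p3 * (1.0 - p1) * (1.0 - p4)
    return forward, reverse


# Hypothetical energies with E1 - E2 == E3 - E4 (both equal 1.5).
forward, reverse = detailed_balance_sides(2.0, 0.5, 1.0, -0.5, fermi_energy=0.3, kT=1.0)

# Energies that violate Equation 14.2 should not balance.
bad_forward, bad_reverse = detailed_balance_sides(2.0, 0.5, 1.0, 0.0, fermi_energy=0.3, kT=1.0)
```

The two sides agree only when the energy differences match, which is the constraint the derivation started from.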


FIGURE 14.9

The Fermi-Dirac distribution. Redrawn from Wang, Introduction to Solid State Electronics,
fig. 3.5.1, p. 38.


14.4 Molecular Structure

Understanding the structure of molecules will help us understand some of the
aggregate properties of matter, which influence how matter generates energy (or emits
light) and responds to incident energy (or reflects and transmits light). Though we
will not be analyzing molecular structures in this book, much of the literature on the
appearance and behavior of solids supposes a basic knowledge of molecular
structure. That literature is invaluable when writing rendering programs that include
real materials. To make that literature accessible, we include here a short survey
of molecular structure and bonding. Much of this section is based on material in
McQuarrie [295].

A molecule is an electrically neutral, stable combination of two or more atoms 
[267]. These atoms may all be of the same element, or a variety of atoms may be 
mixed together. Atoms are typically held together by bonds that form between their 
outermost, or valence , electrons. 

The simplest molecule is probably that formed by joining two hydrogen atoms;
it is denoted H$_2$ (in general, a molecule is named by listing its component elements
with a subscript indicating the number of atoms of each type).

There are two general classes of bonds: ionic (or polar) and molecular-orbital (or
covalent).


14.4.1 Ionic Bonds

Ionic bonding is rather straightforward from a macroscopic viewpoint; this is the
sort of bond that forms between two ions, or electrically charged atoms [267]. For
example, from Table 14.1 we see that potassium (K) has an atomic structure with a
single electron in its outermost shell 4s. If this single outermost (or valence) electron
were to be lost by the atom, the atom would become an ion, or a charged atom, with
a single unit of excess positive charge. This is written K$^+$.

Similarly, chlorine (Cl) has an outermost shell of $3p^5$. Since the p shell can contain
up to six electrons, it’s conceivable that chlorine could pick up an extra electron to
complete the shell (this electron could come from some other atom in a solid or
crystal). Then the outermost shell would be $3p^6$, and the atom would become an ion
with a single excess negative charge, written Cl$^-$.

If we bring these two ions together, the equal but opposite excess charges will
neutralize each other, resulting in an electrically neutral molecule: KCl.


14.4.2 Molecular-Orbital Bonds

Molecular-orbital bonds are not as simple as ionic bonds. We will start with the
molecule composed of two hydrogen atoms: H$_2$.

Recall that the shape of an electron orbital is given by a probability function;
the larger the function’s value at some point, the more likely the electron is to be
there. Suppose we consider the 1s orbitals for two hydrogen atoms, initially very far
apart but then brought together. At some distance, the probability fields will begin
to overlap, and it will become increasingly likely that the electron will be found at
some point in a region near a line between the two nuclei. This will cause the total
energy in the system to decrease as the atoms approach each other. When the atoms
get sufficiently close together, the nuclei will begin to repel one another, sending the
energy of the system back up. This dependence of internuclear potential energy on
distance is plotted in Figure 14.10. The label $\Delta E_+$ shows the energy of the combined
system relative to that of two independent hydrogen atoms.

The figure shows that there is some point at which the energy is a minimum, and
that this is less than the energy of the two atoms at a great distance; the H$_2$ molecule
is in a stable state.

The general idea here is that we can describe the orbitals in a molecule by
considering the individual orbitals of the atoms. In fact, we can describe the orbitals
of electrons in the molecule as linear combinations of the orbitals of electrons in
the component atoms; this is called the molecular-orbital method. Mathematically,
we sometimes build molecular orbitals from products of linear combinations of
atomic orbitals; this is called the LCAO-MO (linear combination of atomic
orbitals-molecular orbital) method.

The mathematics predicts two types of orbitals into which electrons can fit:
bonding orbitals that represent an attraction between the two nuclei, and antibonding
orbitals that represent a repulsive force between the nuclei. Typically we subscript
an orbital with the letter a or b to identify whether it is antibonding or bonding. The
bond order, or number of bonds in a molecule, is given by

$$\text{bond order} = \frac{1}{2}\left[\binom{\text{number of electrons}}{\text{in bonding orbitals}} - \binom{\text{number of electrons}}{\text{in antibonding orbitals}}\right] \tag{14.18}$$

FIGURE 14.10

The internuclear potential energy curves of H$_2$. Redrawn from McQuarrie, Quantum Chemistry,
fig. 9-5, p. 352.

Suppose we have two identical hydrogen atoms. Each has an identical single
electron, with a wave function $\psi$ defined everywhere in space (the squared magnitude of
this complex-valued wave function, $\psi^*\psi$, is the probability density for finding the
electron at the place and time the function is evaluated). For atoms A and B, we’ll call
the wave functions for their electrons $1s_A$ and $1s_B$, respectively, reminding us that each
one represents an electron in the 1s orbital centered around its respective nucleus.

In the LCAO-MO approach, molecular orbitals are formed from linear
combinations of atomic orbitals. For H$_2$, we’ll write the possible wave functions

$$\psi_+ = 1s_A + 1s_B$$
$$\psi_- = 1s_A - 1s_B \tag{14.19}$$

These two orbitals are shown in Figure 14.11.

FIGURE 14.11

Linear combinations of two 1s orbitals. Redrawn from McQuarrie, Quantum Chemistry, fig. 9-8,
p. 370.

The bonding orbital $\psi_+$ concentrates electron density between the nuclei. The
antibonding orbital $\psi_-$ makes the region between the nuclei sparse; in fact there’s a
plane between the nuclei perpendicular to the axis between them (the nodal plane)
where the electron density falls to zero.
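The nodal plane can be seen directly by evaluating the two combinations along the internuclear axis. The sketch below assumes the hydrogen 1s amplitude is proportional to $e^{-r}$ with distances measured in Bohr radii, leaves the orbitals unnormalized, and picks an arbitrary nuclear separation; the function names are hypothetical.

```python
import math


def one_s(x, center):
    """Unnormalized hydrogen 1s amplitude on the internuclear axis (lengths in Bohr radii)."""
    return math.exp(-abs(x - center))


def psi_plus(x, a, b):
    """Bonding combination: sum of the two 1s amplitudes."""
    return one_s(x, a) + one_s(x, b)


def psi_minus(x, a, b):
    """Antibonding combination: difference of the two 1s amplitudes."""
    return one_s(x, a) - one_s(x, b)


# Nuclei placed symmetrically about the origin; the separation is an arbitrary choice.
A, B = -0.7, 0.7
midpoint = 0.0

# These amplitudes are real, so the electron density is simply the square.
bonding_density = psi_plus(midpoint, A, B) ** 2
antibonding_density = psi_minus(midpoint, A, B) ** 2
```

At the midpoint the two amplitudes are equal, so they reinforce in the bonding case and cancel exactly in the antibonding case.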

Note that the electron density in these orbitals is symmetric about the axis between
the nuclei, like s orbitals. They have been given the similar name $\sigma$ orbitals. Because
these particular molecular orbitals derive from 1s atomic orbitals, they are called
the $\sigma 1s$ orbitals. There are two common notational conventions for distinguishing
the bonding and antibonding forms of these orbitals [295]. One approach writes
the bonding form as simply $\sigma 1s$, and the antibonding form as $\sigma^* 1s$. The other
approach thinks of the wave function as being either even (or symmetrical) or odd
(or antisymmetrical) about the midpoint between the nuclei, just as cosine and sine
are even and odd about the origin. The German word for “even” is gerade, so the
bonding (even) orbital $\psi_+$ is sometimes written with the subscript g, as in $\sigma_g 1s$.
The German word for “odd” is ungerade, so the antibonding (odd) orbital $\psi_-$ is
sometimes written with the subscript u, as in $\sigma_u 1s$.

FIGURE 14.12

The $\sigma 2p_z$ and $\sigma^* 2p_z$ molecular orbitals. Redrawn from McQuarrie, Quantum Chemistry, fig. 9-9,
p. 371.

We can continue building molecular orbitals from combinations of atomic
orbitals. Usually, only orbitals of similar energies combine, so we can focus our
attention on like or nearby combinations of orbitals. The orbitals built from a pair
of 2s electrons would be written $\sigma_g 2s$ and $\sigma_u 2s$.

Moving to higher energies, we can form combinations of p orbitals. The p orbitals
are not radially symmetric; we can identify the three p orbitals, each one looking
like a pair of spheres about the origin, located along the x, y, and z axes as in
Figure 14.5. Suppose that we place the two hydrogen nuclei some distance apart
along the z axis. Adding the $2p_z$ orbitals for the two atoms gives us again a bonding
and an antibonding set of orbitals, as shown in Figure 14.12. Because they are
symmetric about the internuclear axis, they are classified as $\sigma$-type orbitals:


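$$\sigma 2p_z = 2p_{zA} + 2p_{zB}$$
$$\sigma^* 2p_z = 2p_{zA} - 2p_{zB} \tag{14.20}$$

(The signs here assume that each atom's $2p_z$ orbital is oriented with its positive lobe
pointing toward the other nucleus.)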

On the other hand, the $2p_x$ and $2p_y$ orbitals are not symmetrical about the
internuclear axis. In fact, the yz plane forms a nodal plane for the $p_x$ orbitals, and the
xz plane is nodal for the $p_y$ orbitals, as shown in Figure 14.13.










FIGURE 14.13

The $\pi 2p_x$ and $\pi^* 2p_x$ molecular orbitals. Redrawn from McQuarrie, Quantum Chemistry,
fig. 9-10, p. 372.


Within the atom, an orbital with one nodal plane is called a p orbital, so in the
molecular case such orbitals are called $\pi$ orbitals. The bonding $\pi 2p_x$ orbital has one
nodal plane; the antibonding $\pi^* 2p_x$ has two nodal planes.

To see how this all works out, we will compute the bond order for H$_2$ and He$_2$.
We will fill electrons in the system according to the Pauli exclusion principle. In H$_2$
we have two electrons, so we place one electron of each spin into the $\sigma 1s$ orbital,
and the resulting configuration of H$_2$ is $(\sigma 1s)^2$. The bond order is $(2 - 0)/2 = 1$,
which suggests that there is some net bonding force keeping the atoms together.

Now turn to diatomic helium He$_2$, which has two electrons in each atom for
a total of four. Two of these electrons go into the $\sigma 1s$ orbital, and the other two
go into the $\sigma^* 1s$ orbital. The resulting configuration is $(\sigma 1s)^2 (\sigma^* 1s)^2$, with a bond
order of $(2 - 2)/2 = 0$. The theory says that there is no net force keeping the atoms
together, so this molecule ought not to form. In accordance with this prediction,
He$_2$ has never been experimentally observed [295].
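The bond-order bookkeeping in Equation 14.18 is simple enough to sketch in a few lines. The helper `fill_sigma_1s` below is a hypothetical convenience that covers only the two lowest molecular orbitals discussed so far, filling the bonding orbital first and allowing at most two electrons per orbital.

```python
def bond_order(bonding_electrons, antibonding_electrons):
    """Equation 14.18: half the difference between bonding and antibonding counts."""
    return (bonding_electrons - antibonding_electrons) / 2


def fill_sigma_1s(total_electrons):
    """Place up to four electrons into sigma 1s (bonding), then sigma* 1s (antibonding)."""
    if not 0 <= total_electrons <= 4:
        raise ValueError("this sketch only handles the sigma 1s and sigma* 1s orbitals")
    bonding = min(total_electrons, 2)       # Pauli exclusion: two spins per orbital
    antibonding = total_electrons - bonding
    return bonding, antibonding


h2_order = bond_order(*fill_sigma_1s(2))    # configuration (sigma 1s)^2
he2_order = bond_order(*fill_sigma_1s(4))   # configuration (sigma 1s)^2 (sigma* 1s)^2
```

A nonzero result suggests a net bonding force, while zero suggests none, matching the two cases worked above.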

The construction of molecular orbitals from electron orbitals can become more
complex than the simple examples presented above. As a first example, s and p
orbitals can overlap. This is the case in water (H$_2$O). The oxygen atom has
an electron configuration of $1s^2\,2s^2\,2p_x^2\,2p_y^1\,2p_z^1$ (recall that the p shell fills up so
that electrons are as far apart as possible). The unpaired $2p_y$ and $2p_z$ electrons are
available to the 1s electron in the hydrogen for bonds. We would expect a 90° angle
between the bonds to the two hydrogen atoms, but experiment gives a value of 104°. This
is because we’re leaving out the mutual ionic repulsion between the two hydrogen
atoms, forcing them apart. Including this term brings us closer to 104°.

The full power of the LCAO-MO approach appears when we build up linear 
combinations of electron orbitals to create new molecular orbitals. 

For example, consider the molecule beryllium hydride, BeH$_2$. There are two
Be-H bonds in this molecule, at an angle of 180°. The ground state of beryllium is
$1s^2\,2s^2$; where could these two Be-H bonds come from in such a configuration?
The answer comes from creating a linear combination of beryllium’s 2s and $2p_z$
orbitals, creating a new hybrid orbital, called sp. The bonds forming from these
orbitals are called bond orbitals, made of a combination of the newly created sp
orbitals on beryllium and the 1s orbitals on hydrogen. Figure 14.14 shows the
contours associated with one sp orbital; since $2p_z$ has two lobes, there’s another
orbital just like this at 180° from this one. The complete outer orbital picture for the
beryllium atom is shown in Figure 14.15; note that although the $2p_x$ and $2p_y$ orbitals
are shown for clarity, they are unoccupied. Finally, the mating of a hydrogen atom
with these orbitals is shown in Figure 14.16.
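The text does not give the coefficients of the hybrid. In the usual textbook form, stated here as an assumption relative to this chapter's notation, the two sp hybrids are equal-weight combinations of the 2s and $2p_z$ orbitals, with the $1/\sqrt{2}$ factor keeping each hybrid normalized when the atomic orbitals are orthonormal:

```latex
\begin{align*}
\psi_{sp}^{(1)} &= \tfrac{1}{\sqrt{2}}\,\bigl(2s + 2p_z\bigr) \\
\psi_{sp}^{(2)} &= \tfrac{1}{\sqrt{2}}\,\bigl(2s - 2p_z\bigr)
\end{align*}
```

The sign flip on the $2p_z$ term swaps which lobe is reinforced by the 2s orbital, which is why the two hybrids point in opposite directions along the z axis.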

The energy of this hybrid orbital is different from the energy associated with either
the 2s or 2p electron orbitals. This changes the energy transitions that are available
to the electrons in the molecule, and hence how that molecule (and substances made
of it) will respond to incident energy in the form of light.

This process can be repeated to form an $sp^2$ hybrid orbital, which forms a
molecule such as BH$_3$.

Carrying it one more step, we can consider forming a compound out of carbon.
Compounds containing carbon are called organic molecules, because of the
importance of carbon to life. Molecules that do not contain carbon are called inorganic.
The molecule methane, CH$_4$, is built from one carbon atom and four hydrogen
atoms. The molecule forms a regular tetrahedron with the carbon atom at the
center.

The carbon forms $sp^3$ hybrid orbitals, whose contours are shown in Figure 14.17.
The resulting $sp^3$ orbitals link up with hydrogen’s 1s orbitals to form the tetrahedron
that is methane, as shown in Figure 14.18.

More complex hybrid orbitals and more complicated bonding structures can be
developed from these principles [295], but these examples are sufficient for our
purposes of illustrating the types of structures that form when atoms arrange themselves
into molecules.

The essential point is to notice that the basic energy transitions available to
electrons change when those electrons are involved in a bonding process that brings
atoms together into molecules. In addition, molecules are large compared to atoms,
and can contain significant translational and vibrational energy that also affects what
energy is absorbed and emitted by a system of atoms.

FIGURE 14.14

Contour map for the sp hybrid orbital. Redrawn from McQuarrie, Quantum Chemistry, fig. 9-18,
p. 400.

FIGURE 14.15

The sp orbitals along $p_x$. Redrawn from McQuarrie, Quantum Chemistry, fig. 9-19, p. 400.

FIGURE 14.16

The formation of BeH$_2$. The arrows indicate coupled spins. Redrawn from McQuarrie, Quantum
Chemistry, fig. 9-20, p. 401.

FIGURE 14.17

An electron-density contour map of an $sp^3$ hybrid orbital. Redrawn from McQuarrie, Quantum
Chemistry, fig. 9-24, p. 406.

FIGURE 14.18

The structure of methane, illustrating the tetrahedral arrangement of $sp^3$ hybrid orbitals. Redrawn
from McQuarrie, Quantum Chemistry, fig. 9-26, p. 407.


14.5 Radiation 

When we are able to see an object, it is because light is leaving it and arriving at our 
eye. Such light may be classified into two fundamentally different types: thermal 
and luminescent [267]. 

Thermal emissions are due to the object shedding excess heat energy in the form 
of light. An incandescent light bulb is a thermal radiator; an electric current is run 
through the filament to make it hot, and the filament gets rid of the heat by dispersing 
the energy through the emission of particles in a range of energy. The filaments in 
simple incandescent bulbs are chosen to maximize the number of particles emitted 
in the visible band. In a thermal radiator, the amount of light emitted is primarily 
dependent on the nature of the material and its temperature. 

Luminescent emission is due to energy stored (perhaps for a very short time) in the
material, and is primarily due to factors other than temperature, though the
temperature can affect the material. While thermal energy is generated by the object itself,
luminescent light is in response to light energy arriving from elsewhere. We have
discussed how an object responds to incident light in a macroscopic way in Chapter 13,
where we characterized a material in terms of its bidirectional distribution function.

In the following two sections we will look at thermal radiation and a particularly 
important class of luminescent materials, the phosphors . 


14.6 Blackbodies

If we take a piece of almost any common metal and heat it up enough, it will start to 
glow red. If we raise the temperature still higher, the metal becomes white-hot. This 
simple observation suggests that there is a link between temperature and radiation, 
and in fact that is found to be the case. 

In general, at least some fraction of the energy emitted by some body at a given
frequency is a function of the material and the temperature. Using the second law
of thermodynamics, we can predict the theoretical maximum amount of such light
that can be radiated at a frequency $\nu$ given the temperature $T$ for any material.
We can posit an imaginary material that satisfies this maximum at every frequency;
such a body is called a blackbody.

To find the energy radiated by a blackbody, we first need to find the likely 
distribution of photons for a system at a given state of energy. Just as electrons 
distribute themselves according to Fermi-Dirac statistics, photons follow a statistical 
law as well, called Bose-Einstein statistics, which we derive now. 


14.6.1 Bose-Einstein Statistics

To describe the distribution of photons, we turn to Bose-Einstein statistics , which are 
similar in spirit to the Fermi-Dirac statistics but different in detail. The information 
in this section may be skipped on a first reading of the book, since it is not essential 
to later material. 

Bose-Einstein statistics are appropriate for bosons, which are particles that are
either spinless or have integral spins (recall that electrons have spins of ±1/2) [465].
Photons fall in this class because they have a spin of 1. Photons also are not controlled
by the Pauli exclusion principle, so multiple photons in a system can share the same
quantum numbers.

We will develop Bose-Einstein statistics following Fowles [148]. To begin, we
will divide the range of energies that photons may take on into a number of states
$s$. We will identify each range by the subscript $\nu$, indicating the center frequency
corresponding to that region. Within each state $s_\nu$ there are $g_\nu$ different quantum
modes in which the photon can exist (these correspond to different internal states
of the particle: e.g., polarization). The number of photons $n_\nu$ in state $s_\nu$ is called the
occupation index or occupation number for that state.

FIGURE 14.19

The $g_\nu = 5$ boxes induce $g_\nu - 1 = 4$ walls for the $n_\nu$ photons. We mark the diagram with S for
slot (or photon) and W for wall.

Consider just state $s_\nu$ for a moment. It contains $n_\nu$ photons, distributed among
$g_\nu$ different quantum modes. If we think of each mode as a box, then for a given
distribution we want to know how many ways the $n_\nu$ photons may be allocated
among the $g_\nu$ boxes.

We can find the answer with a little pictorial construction. If there are $g_\nu$ boxes,
then there are $g_\nu - 1$ internal walls (walls between boxes). Suppose we make a
picture containing $n_\nu + g_\nu - 1$ slots, as in Figure 14.19. Into each slot we can place
either a photon or a wall.

To interpret this picture, imagine that there is a horizontal row of $g_\nu$ boxes
separated by $g_\nu - 1$ walls, and in each box there may be 0 or more photons. We start
at the left and count out the number of photons that we see, writing down an S 
(for slot) for each one. When we have placed an S for every photon we move to 
the right and encounter a wall, so we write W. Then we repeat the process; if there 
are no photons, then the next W follows immediately. Each pattern of S’s and W’s 
then describes one particular way in which the photons can be distributed among 
the boxes. We assume that there are walls at the far left and right ends of the boxes 
that don’t need to be explicitly counted. 

The total number of such pictures is the number of permutations of $n_\nu + g_\nu - 1$
objects, or $(n_\nu + g_\nu - 1)!$. But recall that the photons are indistinguishable; this means
that we can permute all the photons and see no difference. So we must divide the
number of pictures by the number of permutations of $n_\nu$ photons, or $(n_\nu)!$. Similarly,
the walls between the boxes are all the same, so we must divide by $(g_\nu - 1)!$. The total
number of ways $W_\nu$ to allocate $n_\nu$ photons into $g_\nu$ modes is therefore

$$W_\nu = \frac{(n_\nu + g_\nu - 1)!}{n_\nu!\,(g_\nu - 1)!} \tag{14.21}$$


When $n_\nu$ and $g_\nu$ are large, $n_\nu + g_\nu - 1 \approx n_\nu + g_\nu$ and $g_\nu - 1 \approx g_\nu$, so we will
drop the constant term 1 below for simplicity.
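The counting argument behind Equation 14.21 can be checked by brute force for small cases. The sketch below enumerates every way to assign occupancy counts to the boxes and compares the tally against the closed form; the function names are hypothetical, and the enumeration is only practical for small $n_\nu$ and $g_\nu$.

```python
from itertools import product
from math import factorial


def count_by_enumeration(n_photons, g_modes):
    """Count distinct occupancy tuples (one count per box) whose total is n_photons."""
    return sum(
        1
        for occupancy in product(range(n_photons + 1), repeat=g_modes)
        if sum(occupancy) == n_photons
    )


def count_by_formula(n_photons, g_modes):
    """Equation 14.21: (n + g - 1)! / (n! (g - 1)!)."""
    return factorial(n_photons + g_modes - 1) // (
        factorial(n_photons) * factorial(g_modes - 1)
    )


# Three indistinguishable photons in five boxes, as in the five-box picture.
enumerated = count_by_enumeration(3, 5)
closed_form = count_by_formula(3, 5)
```

Because the photons are indistinguishable, only the number in each box matters, which is why occupancy tuples are the right thing to enumerate.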

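The total number of ways $W$ to realize a configuration of the whole system, which is
proportional to the probability of that configuration, is then the product over all states:

$$W = \prod_\nu \frac{(n_\nu + g_\nu)!}{n_\nu!\,g_\nu!} \tag{14.22}$$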


To simplify this equation, we will replace the factorial with one of Stirling’s
approximations, $\ln x! \approx x \ln x - x$, which is good for large $x$ [41, 184].¹ So we can write
the system probability $W$ as

$$\ln W = \sum_\nu \left[(n_\nu + g_\nu) \ln(n_\nu + g_\nu) - n_\nu \ln n_\nu - g_\nu \ln g_\nu\right] \tag{14.23}$$
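To get a feel for how large $x$ must be before the simple Stirling form $\ln x! \approx x \ln x - x$ is trustworthy, the sketch below compares it against the exact value obtained from the log-gamma function. The function names are hypothetical; `math.lgamma(x + 1)` is the standard way to get $\ln x!$ without overflow.

```python
import math


def ln_factorial_exact(x):
    """ln(x!) computed through the log-gamma function."""
    return math.lgamma(x + 1.0)


def ln_factorial_stirling(x):
    """The simple Stirling form used in the text: x ln x - x."""
    return x * math.log(x) - x


def relative_error(x):
    """Relative error of the simple Stirling form at x."""
    exact = ln_factorial_exact(x)
    return abs(exact - ln_factorial_stirling(x)) / exact


errors = {x: relative_error(x) for x in (10, 100, 1000, 10000)}
```

The error is more than ten percent at $x = 10$ but falls below a tenth of a percent by $x = 1000$, so the approximation is reasonable for the large photon counts assumed here.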


It can be shown that the stationary point found by differentiating Equation 14.23 is a
maximum [148]. So differentiating and setting the derivative $d(\ln W)$ to 0, we find

$$d(\ln W) = \sum_\nu \left[\ln(n_\nu + g_\nu) - \ln n_\nu\right] dn_\nu = 0 \tag{14.24}$$


If each of the $n_\nu$ were independent, then each term in Equation 14.24 would
need to go to zero in order for the whole expression to go to zero. But they are
not independent. Recall that we are looking at different distributions for a given,
constant energy $E$. It’s that total energy $E = \sum_\nu h\nu\,n_\nu$ that remains constant, so it’s
this derivative $dE$ that goes to zero:

$$dE = \sum_\nu h\nu\,dn_\nu = 0 \tag{14.25}$$

We now want to find $n_\nu$ as a function of $\nu$ so that Equations 14.24 and 14.25 are
simultaneously satisfied. This is easily accomplished by using Lagrange multipliers.
We multiply Equation 14.25 by an unknown constant $-\beta$, and add the result to
Equation 14.24:

$$d(\ln W) - \beta\,dE = 0 \tag{14.26}$$


which we can expand to find

$$\sum_\nu \left[\ln(n_\nu + g_\nu) - \ln n_\nu - \beta h\nu\right] dn_\nu = 0 \tag{14.27}$$

1 Stirling also proposed $n! \approx e^{-n} n^n \sqrt{2\pi n}$ [41].


We now drive the term in brackets to 0 by choosing $\beta$ as

$$\beta = \frac{\ln(n_\nu + g_\nu) - \ln n_\nu}{h\nu} \tag{14.28}$$

Solving for $n_\nu/g_\nu$, we find the occupation index is given by

$$\frac{n_\nu}{g_\nu} = \frac{1}{\exp[\beta h\nu] - 1} \tag{14.29}$$


This distribution maximizes W while holding the total energy E constant. It is 
known as the Bose-Einstein distribution law for photons , or simply Bose-Einstein 
statistics . 
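A quick numerical round trip shows that Equations 14.28 and 14.29 are consistent with each other. In the sketch below the value of $\beta$, the frequency, and the number of modes are hypothetical inputs chosen only to exercise the formulas, and the function names are introduced here for illustration.

```python
import math

PLANCK_H = 6.62607015e-34   # Planck's constant, J*s


def occupation_per_mode(frequency_hz, beta):
    """n_nu / g_nu from Equation 14.29; beta has units of 1/joule."""
    # expm1(x) computes exp(x) - 1 accurately, even for small x.
    return 1.0 / math.expm1(beta * PLANCK_H * frequency_hz)


def implied_beta(n, g, frequency_hz):
    """Equation 14.28: the beta that zeroes the bracketed term for this n and g."""
    return (math.log(n + g) - math.log(n)) / (PLANCK_H * frequency_hz)


beta = 2.5e19          # hypothetical multiplier, 1/J
nu = 5.0e14            # hypothetical frequency in the visible range, Hz
g = 1000.0             # hypothetical number of modes in this state

n = g * occupation_per_mode(nu, beta)
recovered = implied_beta(n, g, nu)
```

For a fixed $\beta$, lower frequencies carry a larger occupation per mode, since each photon costs less energy.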


14.7 Blackbody Energy Distribution

To derive the blackbody energy distribution, consider an object hung by a thread
in the middle of an enclosed cavity, as in Figure 14.20. This object is in thermal 
equilibrium with its environment, which means that it emits energy at the same rate 
that it absorbs it. 

If the object receives irradiance $E$ and has albedo $\beta$, then its radiant exitance $M$
is given by

$$M = \beta E \tag{14.30}$$

If there are many objects in the system with different values of $\beta$, we find that
$E = M/\beta$ is a constant for them all. This statement is Kirchhoff’s law, and it tells
us that the ratio of emitted to absorbed power is the same for all bodies.

A blackbody has a reflectance $\beta = 1$, meaning that it returns to the environment
all of the energy it absorbs. Let’s look more closely at this energy, following the
discussion in Fowles [148] and Moller [311]. The information in this section may be
skipped on a first reading of the book since it is not essential to later material. Our
goal will be to find a description of how much energy can leave the blackbody at any
given frequency. We’ll do this by first characterizing the energy that can leave a hole
in the surface, and then determining the structure of the energy inside the object.
Coupling these two, we will find how much of the internal energy exits through the
hole, which is the radiation of the blackbody.

So we begin by imagining a small hole cut into the surface. By integrating the 
spectral radiant flux density $u_\nu$ over all frequencies we find the radiant flux density $u$: 

$$ u = \int_0^\infty u_\nu\, d\nu \tag{14.31} $$

FIGURE 14.20 

A blackbody in thermal equilibrium. 


We know that the speed of light is $c$. The fraction of total energy $u$ inside the 
cavity that can leak out through this hole is then $uc\, d\omega/4\pi$, where $d\omega/4\pi$ represents the 
amount of the sphere occupied by the solid angle describing the hole. Integrating 
this over the hemisphere at the hole: 

$$ M = \int_{\Omega} \frac{uc}{4\pi} \cos\theta\, d\omega = \frac{uc}{4} \tag{14.32} $$

(see Equation 13.52). The spectral radiant exitance $M_\nu$ is then $M_\nu = u_\nu c/4$. 
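
The factor of 1/4 can be checked numerically. The sketch below assumes the integrand carries the usual projection factor $\cos\theta$ (the angle measured from the normal of the hole) and that $d\omega = \sin\theta\, d\theta\, d\phi$; the function names and the sample energy density are hypothetical.

```python
import math

def hemisphere_cosine_integral(n_theta=2000):
    """Midpoint-rule estimate of the integral of cos(theta) over the hemisphere.

    The integrand does not depend on phi, so the phi integral contributes 2*pi.
    """
    d_theta = (math.pi / 2.0) / n_theta
    total = 0.0
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        total += math.cos(theta) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

def exitance_from_density(u, c=299_792_458.0):
    """Radiant exitance M = (u c / 4 pi) times the hemisphere cosine integral."""
    return u * c / (4.0 * math.pi) * hemisphere_cosine_integral()

u = 1.0e-6  # hypothetical energy density inside the cavity, in J/m^3
print("hemisphere integral:", hemisphere_cosine_integral(), "vs pi:", math.pi)
print("M numerically:", exitance_from_density(u), "vs u*c/4:", u * 299_792_458.0 / 4.0)
```

The hemisphere integral comes out very close to $\pi$, so the $4\pi$ in the denominator reduces to the 4 that appears in $M = uc/4$.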

Now we want to find the energy u inside the object. To find this, we will simplify 
the situation by assuming that the object under study is a box with dimensions 
A x B x C. We will assume that the system is stable, so that the radiation patterns 
inside the box are standing waves . This means that the radiation will be modeled as 
a vibrating wave, and we will assume that the wave has a value of 0 at the sides of 
the box. 

One of the simplest waves to model is the plane wave, given by 

$$ e^{j(\mathbf{k}\cdot\mathbf{r} - \omega t)} \tag{14.33} $$

This sinusoidal wave is traveling in a direction given by the vector $\mathbf{k}$, with the phase 
at the origin governed by $\omega$. If we expand out the vectors, we get 

$$ e^{j(\mathbf{k}\cdot\mathbf{r} - \omega t)} = e^{j k_x x}\, e^{j k_y y}\, e^{j k_z z}\, e^{-j\omega t} \tag{14.34} $$

As stated above, we want these waves to have a value of zero at the box sides. Sine 
waves go through zero every $\pi$ units, so we satisfy our condition in the box if 

$$ k_x A = \pi n_x, \qquad k_y B = \pi n_y, \qquad k_z C = \pi n_z \tag{14.35} $$

where $n_x$, $n_y$, and $n_z$ are all integers. Each set of values $(n_x, n_y, n_z)$ is called a mode 
of the system, and represents a particular, stable state. 

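The magnitude of the direction vector $\mathbf{k}$ can now be written 

$$ k^2 = k_x^2 + k_y^2 + k_z^2 = \pi^2 \left( \frac{n_x^2}{A^2} + \frac{n_y^2}{B^2} + \frac{n_z^2}{C^2} \right) \tag{14.36} $$

Equivalently, 

$$ k^2 = \left( \frac{2\pi}{\lambda} \right)^2 = \left( \frac{2\pi\nu}{c} \right)^2 = 4\pi^2 \frac{\nu^2}{c^2} \tag{14.37} $$

Equating these two expressions for $k^2$, we find 

$$ \frac{n_x^2}{A^2} + \frac{n_y^2}{B^2} + \frac{n_z^2}{C^2} = \frac{4\nu^2}{c^2} \tag{14.38} $$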


The next step makes use of two observations. First, note that Equation 14.38 describes 
an ellipsoid in space. Then notice that each mode corresponds to a point inside 
the ellipsoid, with integer multiples of the coordinates $(2\nu_x A/c,\ 2\nu_y B/c,\ 2\nu_z C/c)$. 
Figure 14.21 shows this interpretation. If we label the axes as $2\nu_x A/c$, $2\nu_y B/c$, 
$2\nu_z C/c$, then the modes correspond to integer points inside the ellipsoid; that is, 
points on the corners of a lattice of unit cubes. 

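So the octant of the ellipsoid for a frequency $\nu$ has volume 

$$ \frac{1}{8} \cdot \frac{4\pi}{3} \cdot \frac{2\nu A}{c} \cdot \frac{2\nu B}{c} \cdot \frac{2\nu C}{c} = \frac{4\pi\nu^3 ABC}{3c^3} = \frac{4\pi\nu^3}{3c^3} V \tag{14.39} $$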


where V = ABC . Because the volume tells us the number of these integer lattice 
points inside the ellipsoid, this expression tells us the number of modes associated 
with all frequencies less than or equal to v . 
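
The claim that the octant volume counts the lattice points can be tested by brute force. The following sketch counts positive-integer triples $(n_x, n_y, n_z)$ that lie inside the ellipsoid $n_x^2/A^2 + n_y^2/B^2 + n_z^2/C^2 \le 4\nu^2/c^2$ and compares the count with $4\pi\nu^3 ABC/(3c^3)$. The box dimensions, the units with $c = 1$, and the function names are arbitrary choices made for illustration.

```python
import math

def count_modes(nu, A, B, C, c=1.0):
    """Count positive-integer triples (nx, ny, nz) inside the ellipsoid octant."""
    limit = 4.0 * nu * nu / (c * c)
    count = 0
    nx_max = int(2.0 * nu * A / c)
    ny_max = int(2.0 * nu * B / c)
    for nx in range(1, nx_max + 1):
        for ny in range(1, ny_max + 1):
            rest = limit - (nx / A) ** 2 - (ny / B) ** 2
            if rest <= 0.0:
                break  # larger ny only makes rest smaller
            # nz may take the values 1 .. floor(C * sqrt(rest))
            count += int(C * math.sqrt(rest))
    return count

def octant_volume(nu, A, B, C, c=1.0):
    """Volume of the ellipsoid octant: 4 pi nu^3 A B C / (3 c^3)."""
    return 4.0 * math.pi * nu ** 3 * A * B * C / (3.0 * c ** 3)

A, B, C = 1.0, 1.5, 2.0  # hypothetical box dimensions
for nu in (5.0, 10.0, 20.0, 40.0):
    ratio = count_modes(nu, A, B, C) / octant_volume(nu, A, B, C)
    print(f"nu = {nu:5.1f}   modes / volume = {ratio:.4f}")
```

The ratio stays below 1 because points near the boundary are undercounted, but it approaches 1 as the frequency grows, which is the regime where the volume estimate is used.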

It turns out that because of polarization there are two photon states that are 
distinct when counting modes, so we need to double our expression above to account 
for them [148,311]. Therefore the total number of modes $g$ per unit volume is 

$$ g = \frac{8\pi\nu^3}{3c^3} \tag{14.40} $$

FIGURE 14.21 

An octant of the ellipsoid of energy modes. 


Within some band of frequencies $d\nu$, we can find the spectral density $g_\nu$ of the modes 
as 

$$ g_\nu = \frac{dg}{d\nu} = \frac{8\pi}{c^3} \nu^2 \tag{14.41} $$


We’re almost there. Now that we know how many modes are associated with 
each frequency, we can find the total energy in the object by finding the energy 
associated with each frequency, and by scaling that energy by the total number of 
modes supported at that frequency. In other words, there are $g_\nu$ resonant modes per 
unit frequency per unit volume. Rayleigh and Jeans supposed that the mean energy 
per mode is given by $kT$ [148]. Then we can find the spectral radiant flux density 
$u_\nu = g_\nu kT$ at a frequency $\nu$: 

$$ u_\nu = g_\nu kT = \frac{8\pi kT}{c^3} \nu^2 \tag{14.42} $$

Equation 14.42 is called the Rayleigh-Jeans law of radiation. From it, we find 
the spectral exitance is 

$$ M_\nu = \frac{1}{4} c\, u_\nu = \frac{2\pi kT}{c^2} \nu^2 \tag{14.43} $$

The Rayleigh-Jeans law has a terrible problem. It tells us that as the frequency 
$\nu$ goes up (that is, as the wavelength $\lambda$ becomes shorter), the energy in the cavity 
will grow without bound. We would expect enormous radiation at extremely high 
frequencies, with more energy all the time as the frequency goes up. Experimentally, 
we find just the opposite. The Rayleigh-Jeans law matches some of the observed 
radiation from isolated bodies, but it begins to fail in the region of the ultraviolet, 
and becomes increasingly inaccurate from there on. This prediction of infinite energy 
at shorter wavelengths was called the ultraviolet catastrophe . The word catastrophe 
reveals the extent to which physicists were troubled by this result: a seemingly 
straightforward argument, based on sound classical principles, yielded an answer that 
was not only at variance with experiment but in fact made a ridiculous prediction. 
This quandary was resolved only with the development of quantum mechanics. 

Planck’s idea of the quantum provided a way out of the ultraviolet catastrophe. 
He supposed, as we have seen above, that energy is only available to a system in 
integral multiples of a basic quantum h. Then each mode has an integral number of 
photons, and the energy per mode must be ihv , where i is some integer. When these 
ideas were applied to blackbody radiation, the results closely matched experiments. 
We will derive those results now. 

As before, we write $n_\nu$ for the occupation index (the number of photons in a 
given mode). So 

$$ u_\nu = g_\nu h\nu\, n_\nu = \frac{8\pi\nu^2}{c^3} h\nu\, n_\nu = \frac{8\pi h\nu^3}{c^3} n_\nu \tag{14.44} $$

with a corresponding radiant exitance 

$$ M_\nu = \frac{2\pi h\nu^3}{c^2} n_\nu \tag{14.45} $$


The question now is to find how many photons occupy each mode. If we think 
of each mode as a possible photon state, then the distribution of photons in the 
different modes follows the Bose-Einstein statistics developed above. Plugging the 
statistical distribution Equation 14.29 into Equation 14.45, we find 

$$ M_\nu = \frac{2\pi h\nu^3}{c^2} \frac{1}{\exp[\beta h\nu] - 1} \tag{14.46} $$


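At long wavelengths, we can simplify this expression. When $x$ is small, $e^x - 1 \approx x$. 
So when the frequency is small ($\beta h\nu \ll 1$), we can write 

$$ M_\nu = \frac{2\pi\nu^2}{c^2} h\nu \frac{1}{\exp[\beta h\nu] - 1} \approx \frac{2\pi\nu^2}{c^2} h\nu \frac{1}{\beta h\nu} = \frac{2\pi\nu^2}{c^2} \frac{1}{\beta} \tag{14.47} $$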


This expression matches the Rayleigh-Jeans law if we set $\beta = 1/kT$. This seems 
reasonable since the Rayleigh-Jeans law is accurate at large wavelengths [148]. 
Using this value for $\beta$, we find 

$$ e_b(\nu, T) = \frac{2\pi h\nu^3}{c^2} \frac{1}{\exp[h\nu/kT] - 1} \tag{14.48} $$

This is Planck’s law for the radiation from a blackbody. Because it is so important, 
the radiant exitance for a blackbody is usually written as $e_b$, as above. It tells us 
the theoretical maximum amount of energy that can be radiated from any object as 
a function of frequency and temperature. The temperature is measured in degrees 
kelvin. 
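
To see the ultraviolet catastrophe and its resolution side by side, the sketch below evaluates the Rayleigh-Jeans exitance $2\pi kT\nu^2/c^2$ and the Planck exitance $(2\pi h\nu^3/c^2)/(\exp[h\nu/kT] - 1)$ at a few frequencies. The constants are the SI defining values; the function names, the temperature, and the sample frequencies are illustrative assumptions.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 299_792_458.0    # speed of light in vacuum, m/s

def rayleigh_jeans_exitance(nu, T):
    """Classical spectral exitance: 2 pi k T nu^2 / c^2."""
    return 2.0 * math.pi * K * T * nu ** 2 / C ** 2

def planck_exitance(nu, T):
    """Planck spectral exitance: (2 pi h nu^3 / c^2) / (exp(h nu / k T) - 1)."""
    return 2.0 * math.pi * H * nu ** 3 / C ** 2 / math.expm1(H * nu / (K * T))

T = 5780.0  # kelvin; an arbitrary example temperature
for nu in (1e11, 1e13, 1e14, 1e15, 3e15):
    rj = rayleigh_jeans_exitance(nu, T)
    pl = planck_exitance(nu, T)
    print(f"nu = {nu:8.1e} Hz   Rayleigh-Jeans = {rj:10.3e}   Planck = {pl:10.3e}   ratio = {pl / rj:.3e}")
```

At low frequencies the ratio is essentially 1, while at high frequencies the Planck value falls far below the classical prediction instead of growing without bound.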

In the development of Equation 14.48, we have assumed that the speed of light 
is $c$; we made this assumption in Equation 14.32. This is only true in a vacuum; 
elsewhere the speed of light is scaled by the index of refraction of the medium. 

Replacing $c$ with $c/\eta(\nu)$ gives us the medium-dependent form of Planck’s law 
[311]: 

$$ e_{b\nu}(\nu, T) = \frac{2\pi h\nu^3\, \eta^2(\nu)}{c^2 \left( e^{h\nu/kT} - 1 \right)} \tag{14.49} $$

where $\eta(\nu)$ is the index of refraction of the medium surrounding the blackbody at 
frequency $\nu$. The other constants in Equation 14.48 are given in Table E.3 in 
Appendix E. 

We can find the total energy $e_b$ radiated from a blackbody over all frequencies by 
integrating Equation 14.48 with respect to $\nu$ from 0 to infinity: 

$$ e_b(T) = \int_0^\infty e_{b\nu}(\nu, T)\, d\nu \tag{14.50} $$


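To ease the integration, we make the substitution $x = \nu(h/kT)$. Then $\nu = (kT/h)x$, 
so $d\nu = (kT/h)\,dx$. Then, expanding Equation 14.50 and making the substitution 
for $\nu$, 

$$
\begin{aligned}
e_b(T) &= \int_0^\infty \frac{2\pi h\nu^3\, \eta^2(\nu)}{c^2 \left( e^{h\nu/kT} - 1 \right)}\, d\nu \\
&= \int_0^\infty \frac{2\pi h\, (kTx/h)^3\, \eta^2(kTx/h)}{c^2 \left( e^x - 1 \right)} \frac{kT}{h}\, dx \\
&= \left( \frac{2\pi k^4 T^4}{h^3 c^2} \right) \int_0^\infty \frac{x^3\, \eta^2(kTx/h)}{e^x - 1}\, dx
\end{aligned}
\tag{14.51}
$$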


In Chapter 11 we saw a number of choices for the function $\eta(\nu)$. To use any of 
these equations in Equation 14.51, we will need to make the substitution $\eta^2(\nu) = 
\eta^2(kTx/h)$, and probably integrate numerically. There are two cases in which we 
can find the integral analytically: when the index of refraction is a constant, and 
when it is linear with respect to frequency. 


14.7.1 Constant Index of Refraction 

If we are willing to assume that $\eta(\nu)$ has a constant value $\eta_k$, then we can simplify 
Equation 14.51 considerably. As we will see, this assumption fails for many materials, 
but it is perfectly true in a vacuum and close to true for many gases. Pulling the 
now-constant index of refraction $\eta_k$ out of the equation gives us 

$$
\begin{aligned}
e_b(T) &= \eta_k^2 \left( \frac{2\pi k^4 T^4}{h^3 c^2} \right) \int_0^\infty \frac{x^3}{e^x - 1}\, dx \\
&= \eta_k^2 \left( \frac{2\pi k^4 T^4}{h^3 c^2} \right) \frac{\pi^4}{15} \\
&= \eta_k^2\, \sigma T^4
\end{aligned}
\tag{14.52}
$$

where $\sigma$ is the Stefan-Boltzmann constant, 

$$ \sigma = \frac{2\pi^5 k^4}{15 c^2 h^3} \tag{14.53} $$


The numerical value for this constant is given in Table E.3. Equation 14.52 is known 
as the Stefan-Boltzmann law for blackbody radiation. This law is often simplified 
further by assuming that the index of refraction $\eta(\nu)$ of the surrounding medium is 
not simply a constant, but is in fact 1 at all frequencies. This is only true in a perfect 
vacuum. With this final simplification we find 

$$ e_b(T) = \sigma T^4 \tag{14.54} $$
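
The closed form can be compared against a direct numerical integration of Planck's law in vacuum. This is a minimal sketch: it integrates in the dimensionless variable $x = h\nu/kT$ with a midpoint rule, truncates the upper limit at a hypothetical cutoff `x_max`, and uses function names chosen only for illustration.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K = 1.380649e-23     # Boltzmann constant, J/K
C = 299_792_458.0    # speed of light in vacuum, m/s

# Stefan-Boltzmann constant from the closed form 2 pi^5 k^4 / (15 c^2 h^3)
SIGMA = 2.0 * math.pi ** 5 * K ** 4 / (15.0 * C ** 2 * H ** 3)

def planck_exitance(nu, T):
    """Planck spectral exitance in vacuum (index of refraction 1)."""
    return 2.0 * math.pi * H * nu ** 3 / C ** 2 / math.expm1(H * nu / (K * T))

def total_exitance(T, n_steps=20000, x_max=50.0):
    """Midpoint-rule integral of planck_exitance over frequency.

    The integration variable is x = h nu / k T, so d_nu = (k T / h) dx.
    The integrand is negligible beyond x_max, an assumed cutoff.
    """
    dx = x_max / n_steps
    scale = K * T / H
    total = 0.0
    for i in range(n_steps):
        nu = (i + 0.5) * dx * scale
        total += planck_exitance(nu, T) * scale * dx
    return total

T = 5780.0  # kelvin; an arbitrary example temperature
print("sigma              :", SIGMA)
print("numerical integral :", total_exitance(T))
print("sigma * T^4        :", SIGMA * T ** 4)
```

The two totals should agree closely, which is a useful sanity check before substituting a frequency-dependent index of refraction and integrating numerically.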


14.7.2 Linear Index of Refraction 

Equation 14.51 may also be integrated analytically when $\eta(\nu)$ is a linear function of 
frequency: $\eta(\nu) = A\nu$. Then we can write 

$$ \int_0^\infty \frac{x^3\, \eta^2(kTx/h)}{e^x - 1}\, dx = \int_0^\infty \frac{x^3}{e^x - 1} \frac{(AkTx)^2}{h^2}\, dx = \frac{8 (AkT)^2 \pi^6}{63 h^2} \tag{14.55} $$

giving us a new simplified formula in terms of $\sigma_l$, the linear constant: 

$$ e_b(T) = \sigma_l T^6 \tag{14.56} $$

where 

$$ \sigma_l = \frac{16 \pi^7 k^6 A^2}{63 c^2 h^5} \tag{14.57} $$

14.7.3 Radiators 

No physical object reaches the theoretical maximum emission of a blackbody. But 
it is sometimes convenient to describe the emissive properties of a material by specifying, 
on a wavelength-by-wavelength basis, the fraction of light it generates with 
respect to a blackbody. A material described this way is called a graybody. 

The CIE has defined a number of standard illuminants for use in lighting work. 
Some of these approximate the sun under certain conditions; others are simple to 
match with easily manufactured light sources. Tables of these values are given in 
Appendix G. 


14.8 Phosphors 

Roughly speaking, a phosphor is a material that absorbs energy and then reemits it, 
generally over some period of time. The lifetime of an excited electron is usually taken 
to be about $10^{-8}$ seconds. If a material responds to incident energy by reradiating 
most of it within $10^{-8}$ seconds after arrival, we call that fluorescence. If the emission 
persists longer than that interval, it is said to be phosphorescence. A material with 
a strong phosphorescent emission is called a phosphor, which means “light-bearer” 
[267]. 

Most phosphors are inorganic (that is, carbon-free) crystals that contain structural 
and impurity defects; that is, the regular crystal lattice is occasionally broken or 
otherwise distorted, and sometimes atoms that do not belong to the lattice appear, 
either in addition to the lattice atoms or instead of them. 

Figure 14.22 shows different processes whereby materials emit radiation in response 
to incident energy. In computer graphics we are chiefly interested in conventional 
luminescence, due to the transitions of electrons in the outer shells and 
incomplete inner shells of electrons. 

The photons emitted by a phosphor are generally less energetic than those that 
are absorbed; the difference appears as heat. This is known as Stokes' law , which 
can be derived from the law of conservation of energy on a quantum level [267]. 
Stokes’ law can be violated but only rarely and in particular circumstances. 

Localized transitions in a material, such as a single electron being promoted into a 
higher energy state within an atom, result in exponential decays of phosphorescence, 
where the emitted energy decreases as $e^{-at}$ with time $t$. The general idea is that after 
the electron absorbs energy from the incident photon, it loses a small amount of 
energy (generally as heat) and then falls into a metastable state. A metastable state 
is an energy level from which the electron is forbidden to make a direct transition 
back down to the ground state. The electron will reside in the metastable state until 
it can make it back up to a higher energy state, from which it is able to move back 
down to the ground state. 

Type of emission            Radiation sources                               Common types of spectra
--------------------------  ----------------------------------------------  -----------------------------------
Gamma-ray fluorescence      Transitions of nucleons in atomic nuclei        Gamma-ray line spectra

X-ray fluorescence          Electronic transitions in inner completed       X-ray line spectra
                            shells of atoms

Conventional luminescence   Electronic transitions in inner incompleted     Visible and near-visible line
                            shells of atoms                                 spectra

                            Electronic transitions in outer (valence)       Visible and near-visible band
                            shells of atoms and molecules                   spectra

Thermal radiation           Transitions of atoms and groups of atoms        Infrared band spectra
                            vibrating and rotating; also electrons as
                            in conventional luminescence

FIGURE 14.22 

Emissions from luminescent solids. The light we see is due to conventional luminescence. Data 
from Leverenz, An Introduction to Luminescence of Solids, table 7, p. 105. 


Conductors are in general not good phosphors, because they have partially filled 
upper bands. An excited electron can step back down to the ground state through 
those upper levels in many small transitions, radiating heat at each step rather than 
a single photon of energy similar to the photon that was absorbed. 

When an electron absorbs enough energy, in some materials it can leave the 
influence of the atom it belonged to and move into the conducting band , where it 
is free to wander throughout the crystal. Such electrons follow a power-law decay, 
losing energy in proportion to $t^{-n}$. This type of decay can last for days even at room 
temperature. 
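
The practical difference between the two decay types is easiest to appreciate numerically. In the sketch below, the decay constants `a` and `n`, the reference time `t0` used to normalize the power law, and the function names are all hypothetical values chosen for illustration; real phosphors would require measured parameters.

```python
import math

def exponential_decay(t, a):
    """Relative emission e^(-a t) for a localized transition."""
    return math.exp(-a * t)

def power_law_decay(t, n, t0=1.0):
    """Relative emission (t / t0)^(-n) for conduction-band electrons, valid for t >= t0."""
    return (t / t0) ** (-n)

a = 1.0   # assumed exponential rate, 1/s
n = 1.5   # assumed power-law exponent
for t in (1.0, 10.0, 100.0, 1000.0, 86400.0):
    print(f"t = {t:8.0f} s   exponential = {exponential_decay(t, a):10.3e}   "
          f"power law = {power_law_decay(t, n):10.3e}")
```

With these assumed parameters the exponential term is numerically indistinguishable from zero within minutes, while the power-law term is still measurable after a full day (86,400 seconds).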

In fact, some phosphors retain over half their initial stored potential phosphorescence 
for six months at room temperature, and longer still at colder temperatures 
[267]. Perhaps the most extreme example involves the exposure of a nonradioactive 
fluorite crystal to a one-minute burst of ultraviolet light. Using a thousand-hour 
(or six-week) exposure, a photographic image was made of the crystal from its 
phosphorescent emission four to five years after excitation [267]. 

Figure 14.23 gives a schematic view of the radiance emitted by a phosphor in 
response to illumination, and provides a summary of the relevant terminology commonly 
used to discuss the process. 

Leverenz [267] suggests combining the exponential and power-law phosphorescent 
decays with the following empirical formula for the radiance $L$ emitted by a 
phosphor at time $t$, given an initial radiance $L_0$ and a material-dependent constant $b$: 

(14.58) 

FIGURE 14.23 

The radiance of a phosphor over time and the terms used to describe it. 1 = rapid growth and 
decay; 2 = slow growth and decay. For the effect of irradiation with $\lambda > \lambda_{pk}$ (emission) during 
phosphorescence, 2 = normal decay (no irradiation), 3 = quenching, and 4 = stimulation. Redrawn 
from Leverenz, An Introduction to Luminescence of Solids, fig. 19, p. 150. 


It may be surprising that the energy stored in a phosphor may be rapidly depleted 
by exposing the material to another beam of light. This process is called quenching, 
and usually works best for power-law type phosphors. We irradiate the phosphor 
with photons whose energies are equal to or slightly above the energy of the emitted 
photons. The stimulation is able to just raise the trapped electrons to a point where 
they can return to their ground state, by virtue of a nonradiative transition; that is, 
the excess energy is dispelled as heat. 

The variety of means by which materials radiate energy is summarized in 
Figure 14.24. 


14.9 Further Reading 

An excellent introductory text to quantum theory for atoms and molecules is 
McQuarrie [295], which requires no more mathematics background than this book. 
Another good introductory quantum mechanics book is Longini [276]. A challenging 
but accessible and thorough introduction to quantum mechanics and subatomic 
particles is offered by Sudbery [427]. Many of the optics texts, such as the one by 
Moller [311] referenced in Chapter 11, also provide good introductions to some of 
this material, particularly the blackbody developments. 

An inexhaustible source of practical and theoretical information on luminescence 
in general and many phosphors in particular is provided by Leverenz in his 
book [267]. 

14.10 Exercises 

Exercise 14.1 

Research the dispersion of light within crystals, including asterism and chatoyancy 
[437,498,501]. Describe how you would simulate light within gems and crystals. 

Exercise 14.2 

Investigate the thirty-two crystallographic point groups [488]. Build a system to 
model 3D crystals from a stereographic projection. 

Exercise 14.3 

Study the optical phenomenon of double refraction. Distinguish the properties of 
uniaxial and biaxial crystals. How would you implement support for such crystals? 
What input would you require from the user? 

Exercise 14.4 

How much energy, in watts, is radiated by a perfect blackbody at T = 5,780 degrees 
kelvin (the temperature of the Sun [406]) in the visual range [380,780] nanometers? 



Simple absorption and reflection 

Dyes and pigments convert incident radiation (photons) into selectively reflected radiation and internal 
heat. 

Tenebrescence 

Scotophors are similar to dyes and pigments, except that at least part of their selective absorption of 
radiation is nonintrinsic. For example, new absorption bands may be induced by treatment with X rays 
or material particles. 

Luminescence (physical [electronic] action) 

Luminophors, or lumophors, convert part of the energy of absorbed photons or material particles into 
emitted radiation in excess of thermal radiation. 

Fluorophors, or fluors, exhibit only fluorescence. 

Phosphors exhibit phosphorescence (with or without fluorescence). 

Designation                                  Means used for excitation (excitant)
-------------------------------------------  ------------------------------------------
Photoluminescence                            Low-energy photons (visible light, UV)
Roentgenoluminescence                        High-energy photons (X rays, gamma rays)
Cathodoluminescence (electroluminescence)    Cathode rays, beta rays
Ionoluminescence (radioluminescence 1)       Alpha particles, ions
Triboluminescence                            Mechanical disruption of crystals

Designation                                  Duration of detectable afterglow (persistence)
-------------------------------------------  ------------------------------------------
Fluorescence                                 Shorter than about 10^-8 sec for optical photons
Phosphorescence                              Longer than about 10^-8 sec for optical photons

Designation                                  Effect of irradiation or heating during phosphorescence
-------------------------------------------  ------------------------------------------
Stimulation (Ausleuchtung)                   Phosphorescence intensity increased during irradiation or heating
Quenching (Tilgung)                          Phosphorescence intensity decreased during irradiation or heating

Luminescence (chemical action) 

Designation                                  Excitant
-------------------------------------------  ------------------------------------------
Chemiluminescence                            Energy from chemical reactions
Bioluminescence                              Energy from biochemical reactions

Designations that are not recommended 

Candoluminescence (non-black-body emissions observed at very high temperatures) 

Thermoluminescence (phosphorescence obtained at various temperatures) 

1 Radioluminescence has been used to describe luminescence excited by any or all 
radioactive-disintegration products. In the case of radium, the alpha-particle excitation predominates. 

FIGURE 14.24 

Some types of luminescence and related phenomena. Data from Leverenz, An Introduction to 
Luminescence of Solids, table 10, p. 148. 





As they made their way along, other familiar 
landmarks came into view, and they excitedly 
pointed them out to one another. Within an 
hour or two they would be down. But then 
Crean spotted a crevasse off to the right, and 
looking ahead they saw other crevasses in their 
path. They stopped, confused. They were on a 
glacier. Only there were no glaciers 
surrounding Stromness Bay. They knew then 
that their own eagerness had cruelly deceived 
them. The island lying just ahead wasn't 
Mutton Island, and the landmarks they had 
seen were the creations of their imagination. 

Alfred Lansing 

(“Endurance: Shackleton's Incredible Voyage,” 1959) 

15 SHADING 

15.1 Introduction 

In this chapter we will look at what happens to light as it reflects off a surface or 
passes through a volume. In contrast to our atomic- and molecular-sized points of 
view in Chapter 14, here we are interested in macroscopic phenomena, usually on 
a human scale. In general, we would like to find a description of the scattering 
distribution function at all points in the environment. 

The term shading is generally used to describe the process of computing the 
light that leaves a point either by self-emission or propagation. The point under 
investigation, which may be on a surface or in space, is called the shading point. 
The function that characterizes the light leaving a shading point as a function of the 
light arriving upon it is called the shading model at that point. Typically we are 
concerned with the light leaving within some solid angle (which may be as large as 
a hemisphere or sphere): we call this the shading exitance solid angle. 

Shading is a field that is still under active development, driven by three forces that 
are often mutually exclusive: accuracy, expressiveness, and speed. For images that 
predict and match real scenes we need to have numerical precision when computing 
how much light is reflected off a surface or transmitted through a volume. For artistic 
freedom and creativity, we would like shading models that let us make objects look 
any way we desire them to look, regardless of the difficulty (or impossibility) of 
achieving that behavior in reality. Finally, any shading model must be fast, since it 
may be executed many millions of times for a single image; if a model satisfies the 
other two criteria but is too slow, it will be a theoretical construct only and will not 
find practical use. 

The question of accuracy is important because only sometimes do we really need 
to match materials with great precision. This is fortunate, because the specific details 
of how materials reflect light vary greatly even for very similar substances. Different 
finishes, different densities of mixtures, and different temperatures all affect how a 
material reflects light. Finding a single practical equation, or set of equations, that 
will match all materials is probably impossible. To match even a single material 
with a physical simulation usually requires knowledge of its atomic and molecular 
composition. When possible, in computer graphics we work with approximate 
shading models. There are a variety of models, each appropriate for a different 
class of surfaces. All of the shading models in this book make many simplifications 
regarding the atomic and molecular structure of the material. Those that document 
all their assumptions carefully are typically called physically based models; those 
that just look good or are useful are called empirical models. Although they have 
a finer pedigree, physically based models can produce less realistic results than an 
empirical model carefully hand tuned for a particular type of material. 

When we care only about opaque surfaces in vacuum (or homogeneous air which 
we are content to treat as vacuum), then we need only find a description of the 
bidirectional reflection distribution function (BRDF) $f_r$ at each point on each surface. 
If the surfaces are partly transparent, then the bidirectional transmission distribution 
function (BTDF) $f_t$ must also be considered. Together, these functions form the 
bidirectional scattering distribution function (BSDF), which we write as simply $f$. 
The BSDF can be applied to points in space as well as points on surfaces. 

Because transmission is generally handled very similarly to reflection, when discussing 
surfaces in this chapter we will concentrate on the BRDF, implicitly including 
the BTDF by analogy. 

Almost all shading models assume that there is no fluorescence, phosphorescence, 
or significant polarization. Thus, we can speak of each frequency of light independently 
of all others, and we need not concern ourselves with what light has arrived 
before the moment of inquiry. So we will write our shading functions without the 
arguments that encode the wavelength, time, and polarization of the involved light. 
The general approach for including polarization and color is to run several simultaneous 
simulations, one for each state of polarization and wavelength. For example, 
to render a color image, we might calculate the radiance at $n$ different wavelengths, 
and then convert this sampled-wavelength description to a single RGB color 
appropriate for display on a raster CRT. More efficient methods are sometimes possible, 
particularly by reusing some geometric information such as visible surfaces 
and the locations of shadows. However, we must be careful to handle refraction 
properly, since this characteristic of light can change the path the light takes, and 
thus the locations of shadows and indirect illumination from different points in the 
scene. 

Even if there are phosphorescent materials in the scene, there will be no problem 
as long as the image is simulated quickly enough. In other words, if the scene 
includes a simulated camera, as long as the camera shutter is open for a sufficiently 
brief interval, phosphorescence doesn’t matter. If we start with no radiation and 
conclude within about $10^{-8}$ seconds, then even if there are phosphorescent effects, 
they will not have the time to manifest themselves. Since the speed of light is about 
$3 \times 10^8$ m/s, we can ignore the time factor in this case if the scene can be surrounded 
within a sphere with a diameter of 3 meters. Note that if another image is generated 
immediately after this one (say for an animation), then the initial condition that the 
scene is black will no longer be satisfied. If there are no phosphorescent materials in 
the scene, then the time interval used for the simulation doesn’t matter. 
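
The 3-meter figure is simply the distance light covers in the fluorescence interval. A minimal sketch of that arithmetic follows; the function names and the idea of testing a scene by its bounding-sphere diameter are assumptions made for illustration, and the exact speed of light gives a distance slightly under the rounded 3 meters.

```python
C = 299_792_458.0      # speed of light in vacuum, m/s
WINDOW = 1.0e-8        # approximate lifetime of an excited electron, s

def light_travel_distance(seconds):
    """Distance in meters that light covers in the given time."""
    return C * seconds

def fits_in_window(scene_diameter_m, window_s=WINDOW):
    """True if light can cross the scene's bounding sphere within the window."""
    return scene_diameter_m <= light_travel_distance(window_s)

print("distance covered in the window:", light_travel_distance(WINDOW), "m")
print("2.5 m scene fits:", fits_in_window(2.5))
print("10 m scene fits:", fits_in_window(10.0))
```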

It is often useful to distinguish direct and indirect illumination upon a shading 
point. Direct illumination is that light which arrives from a luminaire , or light 
source, without interacting with any other objects along the way. Typical light 
sources include thermal radiators, such as flames or incandescent filaments, though 
fluorescent bulbs and even decaying phosphors may be considered sources of direct 
illumination. In general, light arriving at a point A from some other point B is 
considered direct if it is generated at B by any of these self-emitting mechanisms. 
If the arriving light was reflected or transmitted before arriving at A , we speak of 
this as indirect illumination. For efficiency, almost all computational methods take 
special notice of direct illumination. However, they vary in their approach to indirect 
illumination. 

A useful conceptual model of shading is to think of the shading point as surrounded 
by a set of directional functions, which can be imagined as information 
painted on spheres. On the surface of each sphere is a scalar function. We can draw 
the magnitude of the function on the surface of the sphere using black for 0 and 
white for 1 to create a sphere that is dark in some places and light in others. An 
alternative is to represent the magnitude of the function in each direction by scaling 
the radius of the sphere in that direction. If we do this, we don’t actually have a 
hemisphere any more, but rather a radial blob. We will use this convention for 
drawing the magnitude of these functions, but we’ll continue to speak of “spheres” 
and “hemispheres” rather than “blobs” since we really only want to refer to the 
function defined on the spheres, and not the way we have chosen to draw them for 
illustration. 

Figure 15.1 shows three spheres around a point: the illumination sphere, the 
BRDF sphere, and the radiance sphere, respectively describing the incident light, the 
surface BRDF, and the radiated light. We can also imagine an emission sphere for 
points that generate their own light (say by thermal excitation). 

FIGURE 15.1 

A set of spheres around a shading point. (a) An illumination sphere. (b) A BRDF sphere for a given 
direction. (c) An emission sphere. (d) A radiance sphere. 


The illumination sphere gives the illumination (typically the incident radiance) 
of the light arriving from every direction around the shading point (when the point 
is on the surface of an opaque object, this is just a hemisphere). The BRDF sphere 
gives the BRDF for light incident from a particular direction. The radiance sphere is 
the resulting description of the radiance leaving the surface in all directions, either 
by propagation or self-emission. 

For each incident direction, we can create a sphere that gives the magnitude of 
the reflected radiance in each direction. To compute the complete radiated energy, 
we can step through each direction in the illumination sphere, compute the correct 
BRDF sphere, scale it by the magnitude of the incident energy, and add the scaled 
reflectance function to a running sum of propagated radiance stored in the radiance 
sphere. When every input direction has been considered, the resulting spherical 
function describes the propagated light in all directions. 
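
The accumulation just described can be sketched in a few lines. This is only an illustration under stated assumptions: the hemisphere is discretized on a regular grid in $(\theta, \phi)$, the "magnitude of the incident energy" is taken to be incident radiance times the projected solid angle $\cos\theta\, d\omega$, and a constant (Lambertian) BRDF stands in for a real material. All names here are hypothetical.

```python
import math

def hemisphere_samples(n_theta=16, n_phi=32):
    """Return (direction, solid_angle) pairs covering the hemisphere around +z."""
    samples = []
    d_theta = (math.pi / 2.0) / n_theta
    d_phi = 2.0 * math.pi / n_phi
    for i in range(n_theta):
        theta = (i + 0.5) * d_theta
        for j in range(n_phi):
            phi = (j + 0.5) * d_phi
            direction = (math.sin(theta) * math.cos(phi),
                         math.sin(theta) * math.sin(phi),
                         math.cos(theta))
            samples.append((direction, math.sin(theta) * d_theta * d_phi))
    return samples

def lambert_brdf(w_in, w_out, albedo=0.5):
    """Assumed example BRDF: constant albedo / pi for every pair of directions."""
    return albedo / math.pi

def radiance_sphere(illumination, brdf, samples):
    """Accumulate propagated radiance for every outgoing sample direction."""
    radiance = [0.0] * len(samples)
    for w_in, d_omega in samples:
        # incident energy from this direction: radiance times projected solid angle
        incident = illumination(w_in) * w_in[2] * d_omega
        for k, (w_out, _) in enumerate(samples):
            # scale the BRDF sphere for w_in and add it to the running sum
            radiance[k] += brdf(w_in, w_out) * incident
    return radiance

def uniform_sky(direction):
    """Assumed illumination sphere: the same incident radiance from everywhere."""
    return 2.0

samples = hemisphere_samples()
result = radiance_sphere(uniform_sky, lambert_brdf, samples)
print("outgoing radiance range:", min(result), max(result))
```

For this uniform illumination and constant BRDF, every outgoing direction should receive approximately the albedo times the incident radiance, which gives a simple way to validate the loop before substituting a more interesting BRDF.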

We can use this concept to help organize the shading process. The first step is to 
find the illumination sphere, telling us how much light is falling on the point from 
all directions. When this gathering step has completed, we can then combine the 
illumination with the BRDF. When that step is finished, we then use the radiance 
sphere along with the emission sphere to find the energy radiated into any given solid 
angle. 

In practice, we rarely execute these steps completely, and rarely are they so 
decoupled. Many shading algorithms are composed of a set of interconnected 
approximations in all three steps. For example, if the surface is a perfect mirror, then 
when we look at it from a given direction, the only illumination that matters is the 
radiance coming from the reflected direction with respect to our gaze. Gathering 
the entire illumination sphere and running it through the complete BRDF for all 
outgoing directions would be a waste of time if we only need the single sampled 
direction. 

Less drastically, many algorithms sample the illumination sphere carefully in 
directions from where direct illumination is likely to arrive, since in general it is 
very difficult to predict from what other directions light might be arriving. The 
rest of the sphere may be sampled more coarsely, or perhaps not at all and simply 
approximated. Depending on our needs, we may have to reconstruct and filter the 
illuminance and radiance spheres before using them to estimate the outgoing light in 
a particular solid angle. 

Those algorithms that explicitly gather illumination information only from the 
direct light source are called local illumination models. Typically these shading 
models include some form of compensation inside the BRDF for the incomplete 
illumination sphere. The ambient light term found in many hardware implementations 
of shading is an attempt to use a single value to approximate the total contribution 
of the unsampled illumination sphere. 

Algorithms that estimate indirect lighting information are called global 
illumination models. Some of these models care only about a small number of specific 
indirect directions. For example, when viewing a nearly perfectly smooth mirror, 
only a small reflected solid angle needs to be considered. Since the BRDF in this case 
limits the part of the illumination sphere in which we are interested, we can save 
ourselves work and not bother with evaluating samples elsewhere. 

The two terms local and global describe the extremes of a continuum; most useful 
shading models fall somewhere in between. 

For simplicity, we will assume that all normal vectors have unit length, and that 
all derived vectors in this chapter are normalized after they are computed; that is, 
they have length 1. So when we want to think of finding the average of two vectors 
A and B, we might write C = (A + B)/2, which matches our intuitive idea of 
averaging, but we actually mean to follow that with a normalization step, so the 
actual computation would be 


$$\mathbf{C} = \frac{\mathbf{A} + \mathbf{B}}{\|\mathbf{A} + \mathbf{B}\|} \tag{15.1}$$


15.2 Lambert, Phong, and Blinn-Phong Shading Models 

The three simplest shading models are named for their developers: the Lambert 
model, the Phong model,¹ and the Blinn-Phong model. All of these models are 
usually applied only to simple samplings of the direct illumination, typically from 
idealized point and directional light sources. 

The Lambert model applies to a perfectly diffuse surface; the BRDF is simply a 
constant for all directions, wavelengths, and polarizations. 

The Phong model [69] introduces an ad hoc exponentiated cosine to model 
specular highlights. Suppose light from a source in direction S is arriving at a surface 
point with normal N. The perfect specular reflection of the incident light is along 
the reflection direction R, as shown in Figure 15.2. The method of Phong shading 
approximates this reflected light as a cone centered around R with an exponentially 
decreasing intensity. So the reflected light in any other direction V can be found by 
computing the cosine of the cone angle, $\cos\alpha = \mathbf{V}\cdot\mathbf{R}$, and then 
modulating this by using $(\mathbf{V}\cdot\mathbf{R})^{k_e}$, where $k_e$ is a roughness 
(or shininess) exponent. Small values of $k_e$ (such as 1 or 2) model rough surfaces; 
larger values (such as 30 or 40) model smoother, shinier surfaces. 

The Phong shading equation for the radiance in direction V is typically given in 
the form 


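$$L(\mathbf{V}) = k_a L_a + k_d \sum_{i=0}^{n-1} L_i\,(\mathbf{S}_i\cdot\mathbf{N}) + k_s \sum_{i=0}^{n-1} L_i\,(\mathbf{R}_i\cdot\mathbf{V})^{k_e} \tag{15.2}$$

where $L_i$ is the radiance due to source $i$ arriving along direction $\mathbf{S}_i$, and 
the reflected vector $\mathbf{R}_i$ is given by 

$$\mathbf{R}_i = -\mathbf{S}_i + 2(\mathbf{S}_i\cdot\mathbf{N})\mathbf{N} \tag{15.3}$$

Here $\mathbf{S}_i$ points from the surface toward the light, which is the convention 
implied by the diffuse term $\mathbf{S}_i\cdot\mathbf{N}$ in Equation 15.2. 

The scalars $k_d$ and $k_s$ are used to adjust the overall diffuse and specular reflectivity 
of the surface; to conserve energy, $k_d + k_s \le 1$. 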
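
As an illustration of Equation 15.2, the following Python sketch evaluates the Phong 
model for a list of point or directional lights. It assumes that every direction is a 
unit vector, that S points from the surface toward the light (so that 
R = 2(S · N)N - S), and that negative cosines are clamped to zero; the clamping and 
all function names are assumptions made here, not part of the model as stated. 

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def reflect(s, n):
    """Mirror direction of s about the unit normal n (s points toward the light)."""
    d = dot(s, n)
    return tuple(2.0 * d * nc - sc for sc, nc in zip(s, n))

def phong(normal, view, lights, k_a, L_a, k_d, k_s, k_e):
    """Radiance toward `view`: ambient + diffuse + specular.

    lights: list of (S_i, L_i) pairs, a hypothetical layout for the sources.
    """
    total = k_a * L_a                       # ambient term
    for s, L_i in lights:
        n_dot_s = dot(s, normal)
        if n_dot_s <= 0.0:
            continue                        # light is behind the surface
        total += k_d * L_i * n_dot_s        # diffuse term
        r = reflect(s, normal)
        total += k_s * L_i * max(0.0, dot(r, view)) ** k_e  # specular lobe
    return total

n = (0.0, 0.0, 1.0)
head_on = phong(n, n, [(n, 1.0)], k_a=0.1, L_a=1.0, k_d=0.6, k_s=0.3, k_e=30)
```

With the light and viewer both along the normal, every cosine is 1, so the result is 
simply the sum of the three coefficients times their radiances. 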

Notice the first term on the right-hand side is $k_a L_a$; this is called the ambient 
component of the shading model. It attempts to account for all the indirect light 
in the scene, since the remaining two summations only sum contributions over a 
finite number of infinitely thin solid angles. Since most of the illumination sphere 
is completely unsampled, this term is introduced to at least simulate some sort of 
indirect lighting. The philosophy is that indirect light may be coarsely modeled as 
a low-level “background” illumination that is constant everywhere in the scene. So 
we pick a value for this constant and simply add it in. 

FIGURE 15.2 

Some geometry useful for shading. 

¹Bui-Tuong Phong was Vietnamese. In that culture, the first name is a hyphenated construction of 
the family name (Bui) and a generation name (Tuong) shared by all members of that generation of that 
family. The last name (Phong) is the individual’s unique name. For consistency with the other models, the 
technique due to Bui-Tuong Phong should probably be called Bui shading, but the term Phong shading is 
now firmly established in the literature [107]. 

Although it is certainly ad hoc, ambient light is essential in simple models such 
as this. If there are only a few (point or directional) light sources in the scene, and 
if they are not in a position to illuminate some object (perhaps they are behind it, 
or an intervening object casts a shadow), then the object will be perfectly black. We 
know from experience that when there’s at least enough light in a scene for us to 
see objects reasonably clearly, then even those parts of the objects that don’t receive 
direct lighting are partly illuminated. For example, consider a table in a small room 
lit only by a single light bulb directly overhead; the floor under the table receives no 
direct illumination, but it is not pitch black. Ambient light is an inexpensive way to 
include at least some approximation of this important light. 

A variant of Phong shading was introduced by Blinn [46], which is significant 
not because it is a more accurate physical simulation (it uses similar empirical 
approximations), but because it avoids computing the reflection vector $\mathbf{R}_i$ and is 
thus slightly faster. In Blinn-Phong shading, the bisector $\mathbf{H} = (\mathbf{V} + \mathbf{S})/2$ 
(normalized, like all derived vectors in this chapter) is used instead of the reflected 
vector $\mathbf{R}$. So the reflection is given by $(\mathbf{H}\cdot\mathbf{N})^{k_e}$. The shading 
equation is otherwise the same as for the Phong model: 


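$$L(\mathbf{V}) = k_a L_a + k_d \sum_{i=0}^{n-1} L_i\,(\mathbf{S}_i\cdot\mathbf{N}) + k_s \sum_{i=0}^{n-1} L_i\,(\mathbf{H}_i\cdot\mathbf{N})^{k_e} \tag{15.4}$$

where the terms are as in the Phong model, except for the bisector 

$$\mathbf{H}_i = (\mathbf{V} + \mathbf{S}_i)/2 \tag{15.5}$$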
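
A minimal Python sketch of the Blinn-Phong variant follows. It assumes unit-length 
input vectors with S pointing toward the light, normalizes the bisector as the 
chapter's convention requires, and clamps negative cosines to zero; the function 
names and the (direction, radiance) light layout are hypothetical. 

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def blinn_phong(normal, view, lights, k_a, L_a, k_d, k_s, k_e):
    """Radiance toward `view` using the bisector H instead of the mirror vector R."""
    total = k_a * L_a
    for s, L_i in lights:
        n_dot_s = dot(s, normal)
        if n_dot_s <= 0.0:
            continue                                   # light below the horizon
        h = normalize(tuple(a + b for a, b in zip(view, s)))  # normalized bisector
        total += k_d * L_i * n_dot_s
        total += k_s * L_i * max(0.0, dot(h, normal)) ** k_e
    return total

# Mirror configuration: light and viewer at 45 degrees on opposite sides of N.
n = (0.0, 0.0, 1.0)
s = normalize((1.0, 0.0, 1.0))
v = normalize((-1.0, 0.0, 1.0))
mirror = blinn_phong(n, v, [(s, 1.0)], k_a=0.0, L_a=0.0, k_d=0.6, k_s=0.3, k_e=40)
```

In the mirror configuration the bisector coincides with the normal, so the specular 
term reaches its maximum of $k_s L_i$ regardless of the exponent. 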


Neither the Phong model nor the Blinn-Phong model is normalized; we are free 
to choose coefficients $k_a$, $k_d$, $k_s$, and $k_e$ with almost complete freedom. Often even 
the minimal energy-conservation condition mentioned above is ignored, and this is 
entirely reasonable given the simple nature of these models. 

15.2.1 Diffuse Plus Specular 

The heart of the techniques just described is the separation of the BRDF into diffuse 
and specular components. Is this justified? Judd and Wyszecki [232] propose that 
this is an appropriate model for a material that may be described as a rough (diffuse) 
surface upon which there are small specularly reflective patches. As the amount 
of surface covered by the patches increases, the specular component of the shading 
model increases. To create such a material, take a piece of glass with a finely ground 
surface and polish a few spots, and then gradually enlarge those spots until the whole 
surface is perfectly smooth. 

This is a reasonable model for many materials. Figure 15.3 shows the reflectivity 
for typewriter paper for different angles of incidence. There seems to be a rather 
circular diffuse component at normal incidence, along with the introduction of a 
blunt reflection as the incident angle comes down to the horizon. The models 
discussed above seem to capture this material pretty well over most of its range 
(though the grazing reflection near the horizon is not accounted for). 

On the other hand, one of the most famous exceptions to this rule is Earth’s moon. 
At a full moon the sun, Earth, and moon are almost colinear, as in Figure 15.4. We 
know that the moon is roughly spherical, so we would expect that a purely diffuse 
moon would appear bright in the center and fade off to the edges. But during a 
full moon, except for surface features, the moon looks like a flat disk of uniform 
brightness. 

FIGURE 15.3 

The reflectivity of a piece of typewriter paper. Redrawn from Siegel and Howell, Thermal Radiation 
Heat Transfer, fig. 5-10, p. 147. 

FIGURE 15.4 

The geometry of a full moon. Redrawn from Siegel and Howell, Thermal Radiation Heat Transfer, 
fig. 5-12, p. 149. 

FIGURE 15.5 

The reflection function for the moon. The dashed line represents $1/\cos\theta$. Redrawn from Siegel 
and Howell, Thermal Radiation Heat Transfer, fig. 5-11, p. 148. 

In fact, Blinn has noted that some parts of the moon may indeed be modeled by a 
simple combination of a diffuse scattering term and a simple forward scattering term 
[48]. But this doesn’t explain why the entire moon appears equally bright across 
its surface during a full moon. Because the moon is round, the average normal of 
its surface makes an angle $\theta$ with the sun that increases as we work our way out 
to the edge. This means that the radiance striking the moon goes down with $\cos\theta$. 
For the moon to appear equally bright across its face, its reflection function must 
scale as $1/\cos\theta$, and indeed it does, as shown in Figure 15.5. This is probably a 
result of a combination of backscattering and the small-scale topology of the moon’s 
surface [406]. 
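
The argument can be checked numerically. The Python sketch below compares the 
relative brightness across the visible disk for a purely diffuse sphere and for a 
sphere whose reflection function scales as 1/cos θ. It assumes an idealized full 
moon with the sun and viewer along the same axis, and parameterizes the disk by a 
normalized radius r (0 at the center, 1 at the limb); these names and the 
parameterization are illustrative assumptions. 

```python
import math

def cos_theta_at(r):
    """Cosine of the angle between the local normal and the sun direction
    at normalized disk radius r (sin(theta) = r for a sphere seen head-on)."""
    return math.sqrt(1.0 - r * r)

def diffuse_brightness(r):
    """Incident light falls off with cos(theta); the reflectance is constant."""
    return cos_theta_at(r)

def lunar_brightness(r):
    """Incident light falls off with cos(theta); reflectance scales as 1/cos(theta)."""
    c = cos_theta_at(r)
    return c * (1.0 / c)

for r in (0.0, 0.5, 0.9):
    print(r, diffuse_brightness(r), lunar_brightness(r))
```

The diffuse sphere darkens steadily toward the limb, while the 1/cos θ reflection 
function cancels the falloff and leaves a disk of uniform brightness. 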

Most materials are not described by simple surface models such as a combination 
of diffuse and specular components, or even the more sophisticated models we’ll 
see below. As mentioned earlier, a material’s visual appearance is dependent on 
how it is applied, the medium through which it is viewed, the density of various 
other substances in the material, its thickness over some substrate, the properties 
of the substrate, and so on. Consider some paint applied to a surface [232], as in 
Figure 15.6. 

The gloss (or shininess) of the paint depends on how much pigment is within the 
vehicle (oil or water). If there is only a bit of pigment, as in a paint enamel, then when 
the paint dries the outer surface will be smooth and flat, and very glossy. Suppose 
that the paint particles are larger; then when the paint dries, the surface will reveal 
the presence of the larger bumps by being bumpier itself, so we have a semigloss 
finish. At the far extreme there is just enough vehicle to hold the particles together, 
as in a cold-water paint; the final result will be very diffuse. Thus it is not enough 
to simply specify a paint finish or a color in order to predict how a painted material 
will look. Closer examination will also reveal brush marks, varying thicknesses, and 
other phenomena that can affect how a paint appears. 

FIGURE 15.6 

Three different types of paint. (a) Small particles lead to a glossy finish. (b) Medium particles lead 
to a semigloss finish. (c) Large particles lead to a diffuse finish. 


15.3 Cook-Torrance Shading Model 

We now turn to Cook-Torrance shading, a shading technique that is based on the 
physics of a surface. It combines work both in applied physics and computer graphics. 
We will discuss it in some detail to get the flavor for this type of physically based 
shading model. 

The Cook-Torrance model has three main components: a microfacet model of the 
surface, a Fresnel term describing reflectance, and a roughness term parameterizing 
the microfacet distribution. We will now discuss these in turn. 


15.3.1 Torrance-Sparrow Microfacets 


The starting point for Cook-Torrance shading is the geometric description of a surface 
given by the Torrance-Sparrow model [439]. Torrance and Sparrow assumed that 
a surface is composed of many small V-shaped grooves, which are lined with flat 
mirrors called microfacets , as in Figure 15.7(a). A surface is made up of a sea of 
these mirrored grooves, where the direction of each groove is randomly oriented 
with respect to the others. 

The complete surface reflects light via two mechanisms: reflections off of the 
microfacets, and interaction with the substrate below them. Single reflections off 
the microfacets are responsible for specular reflection, while multiple microfacet 
reflections and scattering within the substrate cause diffuse reflection. 

The geometry of the grooves means that the walls of grooves can block some of 
the light that would otherwise fall on a facet (called shadowing ), as in Figure 15.7(b), 
and that some of the light reflected from a facet can be blocked on its way out (called 
masking ), as in Figure 15.7(c). 

These geometrical effects influence how much light is specularly reflected by the 
surface, and in what directions. They are typically gathered together in a geometry 
term denoted G. Blinn [46] gives a detailed description of this term, which may be 
summarized for a point with normal N, illuminated from a direction S, and viewed 
from a direction V as 


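$$G = \min\left\{1,\ \frac{2(\mathbf{N}\cdot\mathbf{H})(\mathbf{N}\cdot\mathbf{V})}{\mathbf{V}\cdot\mathbf{H}},\ \frac{2(\mathbf{N}\cdot\mathbf{H})(\mathbf{N}\cdot\mathbf{S})}{\mathbf{V}\cdot\mathbf{H}}\right\} \tag{15.6}$$

where the bisector $\mathbf{H}$ is given by $\mathbf{H} = (\mathbf{S} + \mathbf{V})/2$, as before. 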
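
The geometry term can be evaluated directly from the three unit vectors. The Python 
sketch below is a minimal illustration of Equation 15.6; it assumes N, S, and V are 
unit vectors on the same side of the surface (so that V · H is positive), and the 
helper names are introduced here for illustration only. 

```python
import math

def dot(a, b):
    """Dot product of two 3-vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    """Scale v to unit length."""
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def geometry_term(n, s, v):
    """G = min(1, masking term, shadowing term) for unit vectors n, s, v."""
    h = normalize(tuple(a + b for a, b in zip(s, v)))  # normalized bisector
    n_h = dot(n, h)
    v_h = dot(v, h)
    masking = 2.0 * n_h * dot(n, v) / v_h     # light blocked on the way out
    shadowing = 2.0 * n_h * dot(n, s) / v_h   # light blocked on the way in
    return min(1.0, masking, shadowing)

n = (0.0, 0.0, 1.0)
g_head_on = geometry_term(n, n, n)                              # nothing blocked
g_grazing = geometry_term(n, n, normalize((1.0, 0.0, 0.1)))     # grazing viewer
```

With the light along the normal and the viewer near the horizon, the masking term 
dominates and G drops well below 1. 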


15.3.2 Fresnel's Formulas 

The amount of light reflected and refracted at an interface is a function of the 
wavelength of the incident light, the geometry of the surface and light, and the angle 
of incidence. These effects are summarized by a set of formulas known as Fresnel's 
formulas . 

Fresnel’s formulas may be derived by writing down Maxwell’s equations at a surface 
boundary, and making sure that energy and continuity constraints are satisfied 
after reflection and refraction. This is a straightforward procedure, but one that 
would take us on a rather prolonged tangent with few beneficial side effects. A full 
derivation may be found in any modern optics text, such as Moller [311]. (Others 
are mentioned in the Further Reading section.) 

Recall from Chapter 11 that conductors are characterized by a complex index 
of refraction $N(\nu)$, which is a function of the frequency $\nu$. From Equation 11.15, 
$N(\nu) = \eta(\nu) + j\kappa(\nu)$, where $\eta(\nu)$ is the simple index of refraction and 
$\kappa(\nu)$ is the extinction coefficient. Since we’re writing all of our equations for a 
single frequency in this chapter, we will not write out the explicit dependency on 
$\nu$; that is, $N = \eta + j\kappa$. 

Because of how they are derived, Fresnel’s equations are usually written in terms 
of the polarization of the reflected light. These terms are typically labeled either $p$ or 
$\parallel$ for the parallel component and $s$ or $\perp$ for the perpendicular component ($s$ stands 
for senkrecht, the German word for “perpendicular”). 

At the interface between two materials, we move from a material with complex 
index of refraction $N_1$ to a material with complex index of refraction $N_2$. We can 
write the relative complex index of refraction $N$ as the ratio of these two values: 

$$N = \frac{N_2}{N_1} = \frac{\eta_2 + j\kappa_2}{\eta_1 + j\kappa_1} \tag{15.7}$$


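We can put this into standard complex-number form by multiplying by 
$\bar{N}_1/\bar{N}_1$, where $\bar{N}_1$ is the complex conjugate of $N_1$: 

$$N = \frac{\eta_2 + j\kappa_2}{\eta_1 + j\kappa_1}\cdot\frac{\eta_1 - j\kappa_1}{\eta_1 - j\kappa_1} = \frac{\eta_1\eta_2 + \kappa_1\kappa_2}{\eta_1^2 + \kappa_1^2} + j\,\frac{\eta_1\kappa_2 - \eta_2\kappa_1}{\eta_1^2 + \kappa_1^2} \tag{15.8}$$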


Using this for the relative index of refraction $N = \eta + j\kappa$, the Fresnel formula is 

$$F_s = \frac{a^2 + b^2 - 2a\cos\theta + \cos^2\theta}{a^2 + b^2 + 2a\cos\theta + \cos^2\theta}$$

$$F_p = F_s\,\frac{a^2 + b^2 - 2a\sin\theta\tan\theta + \sin^2\theta\tan^2\theta}{a^2 + b^2 + 2a\sin\theta\tan\theta + \sin^2\theta\tan^2\theta} \tag{15.9}$$

where $\theta$ is the angle of incidence, and $a$ and $b$ are given by 

$$2a^2 = \sqrt{(\eta^2 - \kappa^2 - \sin^2\theta)^2 + 4\eta^2\kappa^2} + (\eta^2 - \kappa^2 - \sin^2\theta)$$

$$2b^2 = \sqrt{(\eta^2 - \kappa^2 - \sin^2\theta)^2 + 4\eta^2\kappa^2} - (\eta^2 - \kappa^2 - \sin^2\theta) \tag{15.10}$$


The Fresnel reflection as a function of the angle of incidence is plotted in 
Figure 15.8 for an air-glass boundary (the relative simple index of refraction $\eta$ is about 
1.5). Notice that the parallel term $F_p$ of Equation 15.9 drops to zero at a particular 
incident angle $\theta_B$. This angle is given by 

$$\tan\theta_B = \frac{\eta_2}{\eta_1} \tag{15.11}$$

and is known as Brewster’s angle; Equation 15.11 is known as Brewster’s law. At 
Brewster’s angle the reflected light is entirely perpendicular-polarized, since only 
the $F_s$ component survives. 

FIGURE 15.8 

The Fresnel reflectance for an air-glass boundary with index of refraction 1.5. We show the two 
polarized components and the term for unpolarized light. Redrawn from Judd and Wyszecki, Color 
in Business, Science and Industry, fig. 3.2, p. 400. 

In Figure 15.9 we show the Fresnel reflectance for unpolarized light at a number 
of different relative indices of refraction. 

The reflectance $F$ for polarized light is a weighted sum of the polarized 
components: $F = sF_s + pF_p$, where $s + p = 1$. Unpolarized light is described by 
$s = p = 1/2$, so for unpolarized light $F = (F_s + F_p)/2$. The equations for a 
dielectric-dielectric interface can be found by setting $\kappa$ to zero and using $\eta$ as the 
index of refraction for the second dielectric. 

Note that when $\theta = 90°$, then $F_s = F_p = 1$, regardless of the constants $\eta$ and $\kappa$. 
This is why surfaces such as rough paper appear shiny when we view them at a very 
shallow, grazing incident angle. 
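
The Fresnel formulas translate directly into code. The following Python sketch 
evaluates $F_s$ and $F_p$ from Equations 15.9 and 15.10 for a relative index 
$\eta + j\kappa$, and averages them for unpolarized light. The function names are 
assumptions made for illustration; the sketch also relies on floating-point 
`tan(pi/2)` being a large finite number rather than handling 90° specially. 

```python
import math

def fresnel(theta, eta, kappa=0.0):
    """Return (F_s, F_p) at incidence angle theta (radians)."""
    sin2 = math.sin(theta) ** 2
    c = math.cos(theta)
    t = eta * eta - kappa * kappa - sin2
    root = math.sqrt(t * t + 4.0 * eta * eta * kappa * kappa)
    a2 = 0.5 * (root + t)                    # a^2 from Equation 15.10
    b2 = 0.5 * (root - t)                    # b^2 from Equation 15.10
    a = math.sqrt(max(a2, 0.0))
    f_s = (a2 + b2 - 2.0 * a * c + c * c) / (a2 + b2 + 2.0 * a * c + c * c)
    st = math.sin(theta) * math.tan(theta)   # sin(theta) * tan(theta)
    f_p = f_s * (a2 + b2 - 2.0 * a * st + st * st) / (a2 + b2 + 2.0 * a * st + st * st)
    return f_s, f_p

def fresnel_unpolarized(theta, eta, kappa=0.0):
    """Average of the two polarized components (s = p = 1/2)."""
    f_s, f_p = fresnel(theta, eta, kappa)
    return 0.5 * (f_s + f_p)

normal_incidence = fresnel(0.0, 1.5)             # air-glass, looking straight on
at_brewster = fresnel(math.atan(1.5), 1.5)       # tan(theta_B) = eta
at_grazing = fresnel_unpolarized(math.pi / 2.0, 1.5)
```

For glass with $\eta = 1.5$, both components equal 0.04 at normal incidence, the 
parallel component vanishes at Brewster's angle (about 56.3°), and the reflectance 
approaches 1 at grazing incidence. 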

To find the Fresnel coefficients for transmission, we observe that when there is no 
absorption, all the incident flux is either reflected or transmitted: 
$\Phi_i = F_r\Phi_i + F_t\Phi_i$. 

If we expand the flux in terms of radiance [181], we find that in terms of the incident 
radiance $L_i$, the reflected and transmitted radiances $L_r$ and $L_t$ are as follows: 

$$L_r = F_r L_i, \qquad L_t = F_t\,\frac{\eta_t^2}{\eta_i^2}\,L_i \tag{15.12}$$

The ratio of the simple indices of refraction $\eta_i$ and $\eta_t$ of the incident and transmitted 
materials is necessary because of the different solid angles occupied by the incident and 
transmitted beams. 

FIGURE 15.9 

The Fresnel reflection for unpolarized light for different indices of refraction. Redrawn from Judd 
and Wyszecki, Color in Business, Science and Industry, fig. 3.3, p. 401. 

Computing the Fresnel term requires knowledge of $\eta$ and $\kappa$ at the appropriate 
wavelength. When these terms are not available, but the reflectance at normal 
incidence is known, Cook and Torrance suggest a practical alternative. For metals 
and nonmetals alike, they set $\kappa = 0$ to get a value for $\eta$ at normal incidence. They 
then use that $\eta$, together with $\kappa = 0$, for the Fresnel computation. 

The Fresnel equation for unpolarized light, with the extinction coefficient 
$\kappa = 0$, is 

$$F = \frac{1}{2}\,\frac{(g - c)^2}{(g + c)^2}\left\{1 + \frac{[c(g + c) - 1]^2}{[c(g - c) + 1]^2}\right\} \tag{15.13}$$

where 

$$c = \cos\theta = \mathbf{V}\cdot\mathbf{H}, \qquad g^2 = \eta^2 + c^2 - 1 \tag{15.14}$$


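At normal incidence, $\theta = 0$, so $c = 1$ and $g = \eta$, and the Fresnel coefficient $F_0$ is 

$$F_0 = \left(\frac{\eta - 1}{\eta + 1}\right)^2 \tag{15.15}$$

which we can solve for $\eta$: 

$$\eta = \frac{1 + \sqrt{F_0}}{1 - \sqrt{F_0}} \tag{15.16}$$

This $\eta$ may then be used to compute the Fresnel term at other angles of incidence. 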
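
As a sketch of this procedure, the Python below recovers $\eta$ from a measured 
normal-incidence reflectance and then evaluates the unpolarized Fresnel term at 
other angles. It assumes the form of Equation 15.13 with $[c(g - c) + 1]^2$ in the 
denominator of the bracketed ratio, which is the form that reduces to Equation 15.15 
at $c = 1$; the function names are hypothetical. 

```python
import math

def eta_from_f0(f0):
    """Invert F0 = ((eta - 1) / (eta + 1))^2 for eta (requires 0 <= f0 < 1)."""
    root = math.sqrt(f0)
    return (1.0 + root) / (1.0 - root)

def fresnel_unpolarized(c, eta):
    """Unpolarized Fresnel reflectance with kappa = 0, where c = cos(theta) = V . H."""
    g = math.sqrt(eta * eta + c * c - 1.0)
    lead = 0.5 * (g - c) ** 2 / (g + c) ** 2
    ratio = (c * (g + c) - 1.0) ** 2 / (c * (g - c) + 1.0) ** 2
    return lead * (1.0 + ratio)

eta = eta_from_f0(0.04)                  # a glass-like reflectance of 4 percent
f_normal = fresnel_unpolarized(1.0, eta)
f_45 = fresnel_unpolarized(math.cos(math.radians(45.0)), eta)
f_grazing = fresnel_unpolarized(0.0, eta)
```

A reflectance of 0.04 yields $\eta = 1.5$; feeding that back in reproduces 0.04 at 
normal incidence and 1 at grazing incidence, with intermediate angles in between. 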


15.3.3 Roughness 

The last term in the model characterizes the distribution of the slopes of the 
microfacets. As light arrives at different angles, different distributions of microfacets will 
cause different patterns of reflection. 

We use the term $D$ to describe the facet slope distribution function. Blinn [46] 
has presented a variety of slope models; one of the simplest is the Gaussian model: 

$$D = c\,e^{-(\alpha/m)^2} \tag{15.17}$$

for an arbitrary constant $c$. The angle $\alpha = \cos^{-1}(\mathbf{N}\cdot\mathbf{H})$. 

The constant $m$ is the RMS slope of the microfacets. A small value of $m$, such as 
0.2, indicates that the surface is smooth and the grooves are shallow, and it produces 
a sharp highlight. Large values of $m$, such as 0.8, indicate a rough surface with deep 
grooves, and produce more spread-out highlights. 

Cook and Torrance [103] have pointed out that the Beckmann theory can 
describe both rough and smooth dielectrics and conductors. For rough surfaces, the 
Beckmann distribution function is given by 

$$D = \frac{\exp[-((\tan\alpha)/m)^2]}{m^2\cos^4\alpha} \tag{15.18}$$

This has the advantage over Equation 15.17 in that it requires only the one parameter 
$m$ to characterize the surface roughness. 

A surface may be characterized as a combination of several different roughnesses 
at different scales. We can combine the different scales together with simple linear 
weighting: 

$$D = \sum_k w_k\,D(m_k) \tag{15.19}$$

using a weight $w_k$ on the distribution with RMS slope $m_k$. 
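
The slope distributions are short enough to write out. The Python sketch below 
implements the Gaussian model, the Beckmann distribution, and the weighted 
combination of several roughness scales; the function names and the list-of-pairs 
layout for the weights are assumptions chosen for illustration. 

```python
import math

def gaussian_d(alpha, m, c=1.0):
    """Gaussian slope model, Equation 15.17, with arbitrary constant c."""
    return c * math.exp(-(alpha / m) ** 2)

def beckmann_d(alpha, m):
    """Beckmann distribution, Equation 15.18; alpha is the angle between N and H."""
    return math.exp(-(math.tan(alpha) / m) ** 2) / (m * m * math.cos(alpha) ** 4)

def blended_d(alpha, scales):
    """Weighted sum over roughness scales, Equation 15.19.

    scales: list of (w_k, m_k) pairs (hypothetical layout).
    """
    return sum(w * beckmann_d(alpha, m) for w, m in scales)

smooth_peak = beckmann_d(0.0, 0.2)    # sharp highlight: tall, narrow lobe
rough_peak = beckmann_d(0.0, 0.8)     # spread-out highlight: low, wide lobe
mixed_peak = blended_d(0.0, [(0.5, 0.2), (0.5, 0.8)])
```

At $\alpha = 0$ the Beckmann value is simply $1/m^2$, so the smooth surface peaks far 
higher than the rough one, while away from the peak the ordering reverses. 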


15.3.4 The Cook-Torrance Model 

The Cook-Torrance model combines the preceding pieces. We begin by considering 
an opaque surface, and split its BRDF into two pieces: one to handle the specularly 
reflected light, the other the diffusely reflected light. This distinction is motivated by 
our discussion earlier that the BRDF for some surfaces may be described by the sum 
of a purely diffuse term and a purely specular term, so it seems reasonable to think 
of writing the two pieces separately. Later on we’ll see more complex BRDFs that 
don’t decompose along such intuitive lines, but can be usefully broken down into 
pieces anyway. In general, it doesn’t matter how we decompose the BRDF as long 
as all the pieces add up to the original. 

In this case the diffuse and specular pieces are well understood, and we can write 

$$f_r = s f_s + d f_d \tag{15.20}$$

where $s$ and $d$ are the specular and diffuse coefficients respectively, and $s + d \le 1$. 

The ambient term is computed as in the previous models, with a hemispherical- 
reflectance term in the BRDF. Ideally, this should be scaled by the amount of the 
hemisphere that isn’t otherwise accounted for by direct light, but since ambient light 
is a very crude approximation anyway, there’s little additional harm in assuming that 
the entire hemisphere contributes ambient light. 

The total reflected radiance is given by 

$$L_r = L_a R_a + \sum_{i=0}^{n-1} L_i\,(\mathbf{N}\cdot\mathbf{S}_i)(s f_s + d f_d)\,d\omega_i \tag{15.21}$$

for ambient radiance $L_a$ reflected by $R_a$, and for a sum over $n$ light sources with 
radiance $L_i$ in direction $\mathbf{S}_i$ occupying solid angle $d\omega_i$ from the shading point. 

We have seen the diffuse term $f_d$ in Chapter 13, recalling Equation 13.59, 



(15.22) 

The term $f_s$ is found by combining the Fresnel, masking, shadowing, and distribution 
terms: 

$$f_s = \frac{1}{\pi}\,\frac{F\,D\,G}{(\mathbf{N}\cdot\mathbf{S})(\mathbf{N}\cdot\mathbf{V})} \tag{15.23}$$

for illumination coming in from direction $\mathbf{S}$. 
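
Putting the three terms together gives a compact specular evaluator. The Python 
sketch below is one possible reading of Equation 15.23, not a definitive 
implementation: it assumes unit vectors, a dielectric-style Fresnel term with 
$\kappa = 0$ (Equation 15.13), the Beckmann distribution for $D$, and returns zero when 
the light or viewer is below the horizon. All function names are hypothetical. 

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def fresnel_unpolarized(c, eta):
    """Fresnel term F with kappa = 0; c = V . H."""
    g = math.sqrt(eta * eta + c * c - 1.0)
    return 0.5 * (g - c) ** 2 / (g + c) ** 2 * (
        1.0 + (c * (g + c) - 1.0) ** 2 / (c * (g - c) + 1.0) ** 2)

def beckmann_d(alpha, m):
    """Microfacet slope distribution D."""
    return math.exp(-(math.tan(alpha) / m) ** 2) / (m * m * math.cos(alpha) ** 4)

def geometry_term(n, s, v, h):
    """Masking and shadowing term G."""
    n_h, v_h = dot(n, h), dot(v, h)
    return min(1.0, 2.0 * n_h * dot(n, v) / v_h, 2.0 * n_h * dot(n, s) / v_h)

def cook_torrance_specular(n, s, v, eta, m):
    """f_s = F * D * G / (pi * (N . S) * (N . V))."""
    n_s, n_v = dot(n, s), dot(n, v)
    if n_s <= 0.0 or n_v <= 0.0:
        return 0.0                                   # below the horizon
    h = normalize(tuple(a + b for a, b in zip(s, v)))
    alpha = math.acos(min(1.0, dot(n, h)))           # guard against rounding
    F = fresnel_unpolarized(dot(v, h), eta)
    D = beckmann_d(alpha, m)
    G = geometry_term(n, s, v, h)
    return F * D * G / (math.pi * n_s * n_v)

n = (0.0, 0.0, 1.0)
f_s_head_on = cook_torrance_specular(n, n, n, eta=1.5, m=0.2)
```

In the head-on case $F = 0.04$, $D = 25$, and $G = 1$, so the result is $1/\pi$; this 
makes a convenient sanity check before trying oblique directions. 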

Recall that Cook and Torrance find $\eta$ from the reflection at normal incidence, 
and interpolate from there. The color of the reflected light may also be interpolated. 
Suppose that for some wavelength $\lambda$, we know the reflected flux $\Phi_0$ for normally 
incident light (that is, $\theta_i = 0$), and also the Fresnel coefficient $F_0$ at that angle. The 
reflected flux $\Phi_{\pi/2}$ at grazing incidence ($\theta_i = \pi/2$) is the color of the incident light, 
because the Fresnel coefficient $F_{\pi/2} = 1.0$ at every wavelength, as we saw above. Then we 
can linearly interpolate between these two fluxes given some incident angle $\theta_i$: 

$$\Phi_\theta = \Phi_0 + (\Phi_{\pi/2} - \Phi_0)\,\frac{\max(0,\ F_\theta - F_0)}{F_{\pi/2} - F_0} \tag{15.24}$$

where $F_\theta$ is the Fresnel coefficient at $\theta = \theta_i$. 
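
The interpolation of Equation 15.24 is a one-line blend per wavelength or color 
channel. The Python sketch below assumes the fluxes are given as plain numbers for a 
single wavelength and that $F_{\pi/2} = 1$ by default; the function and parameter names 
are illustrative. 

```python
def interpolate_flux(flux_0, flux_90, f_theta, f_0, f_90=1.0):
    """Blend between the normal-incidence flux and the grazing-incidence flux.

    flux_0:  reflected flux at normal incidence (theta = 0)
    flux_90: reflected flux at grazing incidence (the color of the light)
    f_theta: Fresnel coefficient at the current incident angle
    f_0:     Fresnel coefficient at normal incidence
    """
    t = max(0.0, f_theta - f_0) / (f_90 - f_0)
    return flux_0 + (flux_90 - flux_0) * t

at_normal = interpolate_flux(0.2, 1.0, f_theta=0.04, f_0=0.04)
halfway = interpolate_flux(0.2, 1.0, f_theta=0.52, f_0=0.04)
at_grazing = interpolate_flux(0.2, 1.0, f_theta=1.0, f_0=0.04)
```

Applying the same function to each color channel shifts the highlight from the 
material's color at normal incidence toward the color of the light at grazing angles. 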

A set of vases rendered with the Cook-Torrance model is shown in Figure 15.10 
(color plate). These include vases made of carbon, rubber, obsidian, lunar dust, and 
rust. 

The combination of Fresnel reflection and the microfacet distribution means that 
the peak of the specular reflection function is no longer at the angle of perfect specular 
reflection. Suppose that the microfacets are equally distributed in all directions, so 
there are as many facets being struck at small incident angles as at large ones. The 
Fresnel curves tell us that at a large incident angle the reflectance is greater, so these 
facets reflect a bit more light. This pushes the peak of the reflection function a bit 
further from the normal than the perfectly specularly reflected vector [232]. 


15.3.5 Polarization 

Polarization was incorporated into the Torrance-Sparrow model by Wolff and 
Kurlander [487]. They used Jones matrices (see Section 11.4) to describe the polarization 
of rays of light, and tracked the changes in polarization when the light was either 
transmitted or reflected across an interface between media. 

They distinguished between the specular and diffuse components of the light 
during such reflections and transmissions, and accounted for the phase difference 
that occurs at such transitions, which is important in keeping an accurate record of 
the polarization state. 

In their implementation they followed light of only a certain frequency (or a very 
narrow range of frequencies), which is necessary since different frequencies undergo 
different changes in polarization at interfaces. 

The tracking of polarization in a geometric-optics simulation can be a tricky 
business, since as we saw in Chapter 14, photons are part of the class of subatomic 
particles known as bosons , and we cannot distinguish one photon from another. So 
the state of any particular photon can only be characterized in the sense that the 
entire beam can be characterized, and individual photons must be accounted for 
probabilistically [427]. When we treat light beams as containing great numbers of 
photons, then this distinction becomes less relevant. 


15.4 Anisotropy 

The previous models have all assumed that the surfaces reflected equally from any 
direction of view; only the angle made by the incident light with the surface normal 
mattered. We now turn from these isotropic models to those with explicit anisotropy . 
Recall that a surface that is anisotropic is one for which not only the angle of incidence 
matters, but also the angle $\phi$ of the incident light with respect to some distinguished 
direction. In practice, we can determine if a material is isotropic by locating a small 
planar piece of the surface in a fixed position with respect to an observer and a 
light source, and then rotating that piece about an axis defined by its center and 
its normal, as shown in Figure 15.11. In this case the angle of incidence remains 
unchanged. If the light reaching the observer is unchanged through a full rotation 
of the material, it is (at least at that point) isotropic; otherwise it is anisotropic. 
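
The rotation test can be mimicked numerically for a BRDF given as a function. The 
Python sketch below assumes the surface normal is the z axis and that the BRDF takes 
unit incident and outgoing directions expressed in the sample's own frame; both toy 
BRDFs and all names are purely illustrative assumptions. 

```python
import math

def rotate_about_normal(v, phi):
    """Rotate v by phi radians about the z axis (assumed surface normal)."""
    x, y, z = v
    c, s = math.cos(phi), math.sin(phi)
    return (c * x - s * y, s * x + c * y, z)

def looks_isotropic(brdf, in_dir, out_dir, steps=36, tol=1e-9):
    """Spin the sample through a full turn and compare the reflected value."""
    reference = brdf(in_dir, out_dir)
    for k in range(1, steps):
        phi = 2.0 * math.pi * k / steps
        # Spinning the sample by phi is the same as spinning both fixed
        # directions by -phi when they are expressed in the sample's frame.
        value = brdf(rotate_about_normal(in_dir, -phi),
                     rotate_about_normal(out_dir, -phi))
        if abs(value - reference) > tol:
            return False
    return True

def constant_brdf(in_dir, out_dir):
    """Perfectly diffuse reflectance: the same in every direction."""
    return 1.0 / math.pi

def grained_brdf(in_dir, out_dir):
    """Toy anisotropic reflectance with a grain along x (not a physical model)."""
    h = tuple(a + b for a, b in zip(in_dir, out_dir))
    length = math.sqrt(sum(c * c for c in h))
    return 1.0 + (h[0] / length) ** 2

r = math.sqrt(0.5)
incident, outgoing = (r, 0.0, r), (0.0, r, r)
```

A finite number of rotation steps can only suggest isotropy at the tested directions, 
which mirrors the remark that the physical test holds "at least at that point." 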

Using a goniometer , we can measure the reflectivity of a sample, and in particular 
its isotropy. Figure 15.12 shows a physical apparatus consisting of a piece of material 
surrounded by an opaque hemisphere. There are two holes in the hemisphere, and 
through one shines a narrow beam of light. Just outside the other is a detector. If 
the apparatus is swung around the normal to the surface, then we can measure the 
reflectance as we swing around the material. The amount by which the reflectance 
changes is one measure of the anisotropy of the material. We can also independently 
move the two holes nearer to the pole or the equator to see how the isotropy varies 
with the angles of incidence and reflection. 

Usually an anisotropic surface is considered to possess an intrinsic grain, or 
a distinguished direction lying on the surface in which the surface is particularly 
smooth, or at least at its smoothest. A good example is satin, which is composed of 
very fine threads that are closely woven side by side. As the surface is rotated, 
we can see a definite preferred position along which light is more strongly reflected 
than any other; this is the direction when we are looking at the sides of the threads, 
rather than along their lengths and at the grooves between them. Hair, velvet, and 
brushed aluminum (such as that found on the front of many stereo systems) are other 
common examples of anisotropic materials with obvious grains. It is important to 
determine this distinguished direction when applying an anisotropic shading model. 

FIGURE 15.11 

If the light reaching the detector does not change as the patch is rotated about the vector V, the 
material is isotropic. 


15.4.1 The Kajiya Model 

The model presented by Kajiya [233] is based on Kirchhoff diffraction theory. We 
will not go into this model in detail because it is based on physical optics (that is, the 
wave nature of light), rather than the geometrical optics we are using in this book. 

The basic idea is that a rough surface is replaced by its local approximate tangent 
plane. This plane is oriented, with one of the directions lying parallel to the surface 
grain. Kajiya derives an expression for the scattering formula for such a surface, 
including the angle made by the incident light with the preferred direction of the 
oriented tangent plane. The Kirchhoff approximation is valid for surfaces that are 
relatively smooth, but it breaks down if there is appreciable self-shadowing, or if the 
surface is rough enough to cause multiple scattering at the surface [344]. 

FIGURE 15.12 

A goniometer set up to measure isotropy. 


15.4.2 The Poulin-Fournier Model 

Consider the anisotropic materials of satin and hair; these can be described as long 
thin cylinders tightly packed parallel to each other. Similarly, brushed aluminum can 
be described as many tiny locally parallel round scratches inscribed on the metal. 
We use the term locally parallel because though the scratches are close packed and 
do not overlap, they may not be straight; for example, they may be arranged in 
concentric circles. 

FIGURE 15.13 

Cylinders parameterized by the intercylinder distance d and the height h. Redrawn from Poulin 
and Fournier in Computer Graphics (Proc. Siggraph '90), fig. 2, p. 275. 


This model is similar in spirit to the Torrance-Sparrow model, but there are a few 
important differences: the grooves have a circular rather than V-shaped cross section, 
they may be both positive (sticking out of the surface) and negative (scratched into 
the surface), they are not randomly scattered but locally parallel, and they need not 
be lined with mirrorlike facets but with any material. 

The cylindrical-scratch model was first used in graphics for anisotropic reflection 
in special cases by Miller [304]; it was later generalized by Poulin and Fournier [344]. 

They suggested that a surface made up of many small parallel cylinders may be 
parameterized by two values: the distance between cylinder centers and the depth to 
which they are embedded in (or gouged out from) the surface, which they called the 
floor . For positive cylinders, they called the distance from the cylinder’s top to the 
floor the height . Figure 15.13 shows these parameters. 

They analyzed this geometry and came up with formulas expressing the shadowing 
and masking effects corresponding to the Torrance-Sparrow geometry term $G$. 
They also derived similar formulas for the case where the cylinders are scratched 
into the surface. 

FIGURE 15.14 

The three pieces of the BRDF in the HTSG model. Redrawn from He et al. in Computer Graphics 
(Proc. Siggraph '91), fig. 6, p. 178. 


15.5 The HTSG Model 

An experimentally verified shading model based on wave optics has been developed 
by He, Torrance, Sillion, and Greenberg [198], which we call the HTSG model. The 
derivation is quite complex and is based on Kirchhoff diffraction theory and wave 
optics, which we have not covered. A careful derivation of the model may be found 
in their original paper [198]. 

The model basically distinguishes three types of reflection: ideal specular (sp), 
uniform diffuse (ud), and directional diffuse (dd), as illustrated in Figure 15.14. 
These are each accounted for by a different part of the BRDF: 

$$f_r = f_r^{sp} + f_r^{ud} + f_r^{dd} \tag{15.25}$$

The complete expression for the BRDF for polarized light is given in their paper. 
We will content ourselves here with simply stating the formulas for the unpolarized 
case, which are daunting in their own right. For full details on the model, see [198]. 

The model is based on two parameters. The first is $\sigma_0$, the RMS roughness of 
the surface, which was symbolized by m in the Cook-Torrance model. The second 
parameter is $\tau$, the autocorrelation length; this is a measure of the distance between 
peaks on the surface. The ratio $\sigma_0/\tau$ is proportional to the RMS slope of the surface. 

Given these parameters, and the geometry in Figure 15.15, the BRDF $f_r$ predicted 
by the HTSG model is given by the expressions in Equation 15.26. We present these 
equations primarily for reference, not exposition; full details on the derivation of 
this model and its unpolarized form are available in He et al. [198]. 

FIGURE 15.15 

The geometry for the HTSG model. Redrawn from He et al. in Computer Graphics (Proc. Siggraph 
'91), fig. 5, p. 178. 


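\[
\begin{aligned}
f_r &= f_{r,sp} + f_{r,ud} + f_{r,dd} \\
f_{r,sp} &= \frac{\rho_s}{\cos\theta_i \, d\omega_i} \times \Delta \\
f_{r,dd} &= \frac{|F|^2 \times G \times S \times D}{\pi \cos\theta_i \cos\theta_r} \\
f_{r,ud} &= a(\lambda) \\
\rho_s &= |F|^2 \times e^{-g} \times S \\
\Delta &= \begin{cases} 1 & \text{if in specular cone} \\ 0 & \text{otherwise} \end{cases} \\
|F|^2 &= (F_s^2 + F_p^2)/2 \\
G &= \left( \frac{\mathbf{v} \cdot \mathbf{v}}{v_z} \right)^2 \frac{1}{|\mathbf{k}_r \times \mathbf{k}_i|^4}
      \times \left[ (\mathbf{s}_r \cdot \mathbf{k}_i)^2 + (\mathbf{p}_r \cdot \mathbf{k}_i)^2 \right]
      \times \left[ (\mathbf{s}_i \cdot \mathbf{k}_r)^2 + (\mathbf{p}_i \cdot \mathbf{k}_r)^2 \right] \\
S &= S_i(\theta_i) \times S_r(\theta_r) \\
S_i(\theta_i) &= \frac{1 - \frac{1}{2}\,\mathrm{erfc}(\tau \cot\theta_i / 2\sigma_0)}{\Lambda(\cot\theta_i) + 1} \\
S_r(\theta_r) &= \frac{1 - \frac{1}{2}\,\mathrm{erfc}(\tau \cot\theta_r / 2\sigma_0)}{\Lambda(\cot\theta_r) + 1} \\
\Lambda(\cot\theta) &= \frac{1}{2} \left( \frac{2}{\sqrt{\pi}} \times \frac{\sigma_0}{\tau \cot\theta}
      - \mathrm{erfc}\left( \frac{\tau \cot\theta}{2\sigma_0} \right) \right) \\
D &= \frac{\pi^2 \tau^2}{4\lambda^2} \sum_{m=1}^{\infty} \frac{g^m \times e^{-g}}{m! \times m}
      \exp(-v_{xy}^2 \tau^2 / 4m) \\
g &= \left[ (2\pi\sigma/\lambda)(\cos\theta_i + \cos\theta_r) \right]^2 \\
\sigma &= \frac{\sigma_0}{\sqrt{1 + (z_0/\sigma_0)^2}} \\
\sqrt{\frac{\pi}{2}}\, z_0 &= \frac{\sigma_0}{4} (K_i + K_r) \exp[-z_0^2 / 2\sigma_0^2] \\
K_i &= \tan\theta_i \, \mathrm{erfc}\left( \frac{\tau}{2\sigma_0} \cot\theta_i \right) \\
K_r &= \tan\theta_r \, \mathrm{erfc}\left( \frac{\tau}{2\sigma_0} \cot\theta_r \right)
\end{aligned}
\tag{15.26}
\]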


Equation 15.26 uses a few additional geometric terms not given in Figure 15.15; 
these are specified in Equation 15.27. 


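\[
\begin{aligned}
\mathbf{v} &= \mathbf{k}_r - \mathbf{k}_i \\
v_{xy} &= \sqrt{v_x^2 + v_y^2} \\
\mathbf{s}_i &= \frac{\mathbf{k}_i \times \mathbf{n}}{|\mathbf{k}_i \times \mathbf{n}|} \\
\mathbf{p}_i &= \mathbf{s}_i \times \mathbf{k}_i \\
\mathbf{s}_r &= \frac{\mathbf{k}_r \times \mathbf{n}}{|\mathbf{k}_r \times \mathbf{n}|} \\
\mathbf{p}_r &= \mathbf{s}_r \times \mathbf{k}_r
\end{aligned}
\tag{15.27}
\]

The complementary error function erfc is defined by 

\[ \mathrm{erfc}(x) = 1 - \frac{2}{\sqrt{\pi}} \int_0^x e^{-t^2} \, dt \tag{15.28} \]

although it is usually computed using a series expansion [3]. 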
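
To make the roughness-dependent pieces of Equation 15.26 more concrete, here is a minimal Python sketch of the shadowing term S and the effective roughness. It is an illustration under stated assumptions, not code from He et al.: the function names are hypothetical, angles are assumed to lie strictly between 0 and pi/2, the library routine math.erfc stands in for a hand-written series expansion, and bisection is simply one convenient way to solve the implicit equation for z_0.

```python
import math

def lambda_term(cot_theta, sigma0, tau):
    """Lambda(cot theta) as written in Equation 15.26."""
    x = tau * cot_theta / (2.0 * sigma0)
    return 0.5 * ((2.0 / math.sqrt(math.pi)) * sigma0 / (tau * cot_theta)
                  - math.erfc(x))

def shadow_one(theta, sigma0, tau):
    """S_i or S_r for a single direction; theta in (0, pi/2)."""
    cot_theta = 1.0 / math.tan(theta)
    x = tau * cot_theta / (2.0 * sigma0)
    return (1.0 - 0.5 * math.erfc(x)) / (lambda_term(cot_theta, sigma0, tau) + 1.0)

def shadowing(theta_i, theta_r, sigma0, tau):
    """S = S_i(theta_i) * S_r(theta_r)."""
    return shadow_one(theta_i, sigma0, tau) * shadow_one(theta_r, sigma0, tau)

def effective_roughness(theta_i, theta_r, sigma0, tau, iterations=60):
    """Solve sqrt(pi/2) z0 = (sigma0/4)(K_i + K_r) exp(-z0^2 / (2 sigma0^2))
    for z0 by bisection (an assumed solver), then return sigma."""
    k_i = math.tan(theta_i) * math.erfc(tau / (2.0 * sigma0) / math.tan(theta_i))
    k_r = math.tan(theta_r) * math.erfc(tau / (2.0 * sigma0) / math.tan(theta_r))
    scale = 0.25 * sigma0 * (k_i + k_r)
    root_half_pi = math.sqrt(0.5 * math.pi)

    def residual(z):
        return root_half_pi * z - scale * math.exp(-z * z / (2.0 * sigma0 * sigma0))

    # residual(0) <= 0 and residual(hi) >= 0 because the exponential is at most 1.
    lo, hi = 0.0, scale / root_half_pi
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if residual(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    z0 = 0.5 * (lo + hi)
    return sigma0 / math.sqrt(1.0 + (z0 / sigma0) ** 2)
```

Near normal incidence the shadowing term stays close to 1 and the effective roughness equals $\sigma_0$; toward grazing angles both fall off, which is the behavior the model relies on.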

If we know the spectral reflectivity of a surface, and the values of $\sigma_0$ and $\tau$ that 
characterize its roughness, the HTSG model has been shown to predict experimental 
results on that material fairly well for a wide variety of materials and directions of 
incident light [198]. This is probably the most complete physically based model in 
the graphics literature today. 

It is important to note that although it has the power to represent a wide variety 
of anisotropic materials and their interaction with polarized light, the HTSG model 
is not normalized [469]. We still need to manually select an appropriate diffuse 
reflectivity component so that energy is conserved. 

The HTSG model is very expensive to compute. Precomputation and storage can 
move the computation into a one-time preprocessing step, but then storage of the 
data and accurate, efficient reconstruction become serious issues. He et al. report 
in a follow-up paper to their original presentation that a square table with eighty 
entries on a side can capture their model to within a relative error of 1%, and that 
using a spline representation to interpolate this table results in a speedup of two to 
three orders of magnitude over direct computation [197]. When a model such as this 
is able to pick up subtle and quickly changing aspects of the BRDF, it is a challenge 
to sample and reconstruct the function in a way that captures the power of the full 
formulation. 


15.6 Empirical Models 

The shading models presented above were based on a physical simulation of the 
underlying surface structure. Not all shading models require such physical 
underpinnings. For some applications, the model need only generate an approximation of 
the right solution. Of course, if accurate simulation is the goal of a particular 
rendering, then such shading models are inappropriate, but for fast previewing and some 
applications a fast, approximate, and easily controlled model may be preferable to 
the physically based ones described above. 


15.6.1 The Strauss Model 

The Strauss model [424] is an almost entirely approximate model that is intended to 
give designers an intuitive set of parameters with which to control surface appearance. 

FIGURE 15.16 

The geometry for the Strauss model. 


The model is based on five surface parameters: 

Color specifies the base color C of the object under white illumination at normal 
incidence. 

Smoothness is a scalar $s \in [0,1]$ sweeping the range from a perfectly diffuse surface 
at s = 0 to one that is perfectly specular at s = 1. This controls both the ratio 
of diffuse to specular reflectance and the size of the specular highlights. 

Metalness is a scalar $m \in [0,1]$ that specifies a point in the range from a 
dielectric at m = 0 to a metal at m = 1. 

Transparency is a scalar $t \in [0,1]$ that sweeps the range from opaque at t = 0 to 
transparent at t = 1. 

Index of refraction is a scalar $n \in \mathbb{R}$ specifying the simple index of refraction of the 
medium. 

The Strauss model is based on the geometry shown in Figure 15.16 (the names of 
vectors and angles have been changed from the original paper for consistency with 
the other shading models in this chapter). Because the model approximates more 
complex surfaces with simpler functions, there are several internal parameters that 
require tuning to get the proper behavior; we will see these below. As always, though 
each term depends on wavelength, we will not write this explicitly. 

The BRDF of the surface is written as the sum of a diffuse term and a specular 
term: 

\[ f_r = f_{r,d} + f_{r,s} \tag{15.29} \]

Taking the diffuse term first, it is written as a product of the Lambert diffuse 
reflection term N • S, the diffuse reflectivity $\rho_d$, and a diffuse adjustment factor $d_a$: 

\[
\begin{aligned}
f_{r,d} &= \rho_d (\mathbf{N} \cdot \mathbf{S}) d_a \\
\rho_d &= (1 - s^3)(1 - t) \\
d_a &= 1 - ms
\end{aligned}
\tag{15.30}
\]

The diffuse reflectivity $\rho_d$ includes the term (1 - t) to account for diminished diffuse 
reflectance for increasingly transparent objects, and a term $(1 - s^3)$ to account for 
less diffuse reflectance as the surface gets smoother. The exponent 3 was chosen 
empirically so that a linear change in smoothness would cause a linear change in the 
diffuse reflectivity. The adjustment factor $d_a$ is used to reduce diffuse reflectivity for 
rough metals. 

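The specular reflectance is based on a Phong-like exponentiated cosine term, 
times a specular adjustment factor $s_a$: 

\[
\begin{aligned}
f_{r,s} &= -(\mathbf{R} \cdot \mathbf{V})^h \, s_a \, s_c \\
h &= \frac{3}{1 - s} \\
s_a &= \min(1, \; r_n + (r_n + k_b) b) \\
r_n &= (1 - t) - f_{r,d}
\end{aligned}
\tag{15.31}
\]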

The exponent h is determined by an empirical function that produced pleasing and 
predictable results. The adjustment factor $s_a$ is designed to simulate off-specular 
peaks and Fresnel reflection. It is based on $r_n$, which is the fraction of light that is 
specularly reflected (that is, it is neither transmitted nor diffusely reflected). A bit of 
light given by $k_b$ is added to this to give the off-specular peak, and then the result 
is multiplied by b to account for Fresnel and local geometric terms; Strauss reported 
that $k_b = 0.1$ worked well. The term $s_c$ is discussed below. 

The factor b increases the specularity near grazing incidence, except when the 
angles $\theta$ or $\delta$ come too close to $\pi/2$, when self-shadowing effects on the surface itself 
become important and reduce the reflectivity. The value of b is found from two 
functions, one named F designed to simulate the Fresnel term, and another named 
G to simulate the geometry term: 


\[ b = F(\theta) \, G(\theta) \, G(\delta) \tag{15.32} \]

These functions are given by 

\[ F(x) = \frac{\dfrac{1}{(x - k_f)^2} - \dfrac{1}{k_f^2}}{\dfrac{1}{(1 - k_f)^2} - \dfrac{1}{k_f^2}} \tag{15.33} \]

and 

\[ G(x) = \frac{\dfrac{1}{(1 - k_g)^2} - \dfrac{1}{(x - k_g)^2}}{\dfrac{1}{(1 - k_g)^2} - \dfrac{1}{k_g^2}} \tag{15.34} \]


The argument x in both functions should be from 0 to 1, so input angles are scaled by 
$1/(\pi/2)$ before computation. The constants $k_f$ and $k_g$ are used to tune the functions 
to better approximate the Fresnel and geometry terms; Strauss suggests $k_f = 1.12$ 
and $k_g = 1.01$. 
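
As a quick check on the shape of these two tuning functions, the following Python sketch evaluates F, G, and the factor b from Equations 15.32 through 15.34. The function names are hypothetical and chosen only for this illustration; the constants are the values Strauss suggests, and angles are assumed to be given in radians in the range 0 to pi/2.

```python
import math

K_F = 1.12  # Fresnel tuning constant suggested by Strauss
K_G = 1.01  # geometry tuning constant suggested by Strauss

def fresnel_like(x, kf=K_F):
    """F(x) of Equation 15.33; x is an angle already scaled to [0, 1]."""
    return ((1.0 / (x - kf) ** 2 - 1.0 / kf ** 2)
            / (1.0 / (1.0 - kf) ** 2 - 1.0 / kf ** 2))

def geometry_like(x, kg=K_G):
    """G(x) of Equation 15.34; x is an angle already scaled to [0, 1]."""
    return ((1.0 / (1.0 - kg) ** 2 - 1.0 / (x - kg) ** 2)
            / (1.0 / (1.0 - kg) ** 2 - 1.0 / kg ** 2))

def strauss_b(theta, delta):
    """b = F(theta) G(theta) G(delta), with both angles scaled by 1/(pi/2)."""
    x_theta = theta / (0.5 * math.pi)
    x_delta = delta / (0.5 * math.pi)
    return fresnel_like(x_theta) * geometry_like(x_theta) * geometry_like(x_delta)
```

F rises from 0 at normal incidence to 1 at grazing incidence, while G stays near 1 over most of the range and drops to 0 at grazing incidence, so b grows toward grazing angles and then collapses just before pi/2.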

Finally, some color shifting typically occurs on highlights off of metals, a 
phenomenon that is naturally accounted for in other models through the wavelength 
dependency in the Fresnel term. In this model, the color at a given wavelength is 
modulated by the term $s_c$ in Equation 15.31, which is given by 

\[ s_c = 1 + m (1 - F(\theta)) (C(\lambda) - 1) \tag{15.35} \]

where $C(\lambda)$ is the color at the wavelength $\lambda$ where the model is being evaluated. 

This is truly a model that has been seasoned to taste by an experienced cook. 
There are numerous magical constants that have been chosen empirically to produce 
a particular type of behavior. But Strauss reports [424] that the final model is intuitive 
and inexpensive, and because there are only a few user parameters that are relatively 
decoupled, it is straightforward to design and modify surface behaviors. 


15.6.2 The Ward Model 

Another simple empirical model with a different spirit was proposed by Ward [469]. 
He desired a shading model that was both simple enough to be attractive to 
implementors, and simultaneously accurate for most materials. When discussing how to 
create such a model, Ward proposed finding “the simplest empirical formula that 
will do the job” [469]. 

Ward derived an isotropic reflectance model that leaves out the geometric and 
Fresnel terms we have seen above. He asserts that these terms are difficult to integrate 
and tend to cancel each other out, and may be reasonably replaced with a single term 
that will normalize the BRDF. The geometry for the Ward isotropic model is the same 
as for the Strauss model. The BRDF generated by this model is given by 

\[ \rho = \frac{\rho_d}{\pi} + \rho_s \frac{1}{\sqrt{\cos\theta \cos\delta}} \, \frac{\exp[-\tan^2\gamma / \sigma^2]}{4\pi\sigma^2} \tag{15.36} \]


where 

$\rho_d$ is the diffuse reflectance, 
$\rho_s$ is the specular reflectance, 

$\theta$ is the angle between N and S, 

$\delta$ is the angle between N and V, 

$\gamma$ is the angle between N and H, and 

$\sigma$ is the RMS standard deviation of the surface slope. 

The terms $\rho_d$ and $\rho_s$ are functions of wavelength, and thus can incorporate Fresnel 
effects. 

Equation 15.36 is very similar to the Phong model, except that it is normalized; 
inspection of the formula shows that it is symmetric with respect to the incident and 
reflected angles $\theta$ and $\delta$, which is a required symmetry in any BRDF that satisfies 
reciprocity. The BRDF given in Equation 15.36 is normalized by the term $1/4\pi\sigma^2$. 
This is accurate as long as $\sigma < 0.2$; beyond that point the surface becomes mostly 
diffuse and the specular term becomes less important. 

The model may be extended in a straightforward way to accommodate two 
perpendicular and uncorrelated slope distributions, allowing us to describe an 
anisotropically reflecting surface. The anisotropic form of Ward’s model is 

\[ \rho = \frac{\rho_d}{\pi} + \rho_s \frac{1}{\sqrt{\cos\theta \cos\delta}} \, \frac{\exp[-\tan^2\gamma \, (\cos^2\phi / \sigma_x^2 + \sin^2\phi / \sigma_y^2)]}{4\pi\sigma_x\sigma_y} \tag{15.37} \]

where in addition to the terms above, 

$\sigma_x$ is the RMS slope in the x direction, 

$\sigma_y$ is the RMS slope in the y direction, and 

$\phi$ is the azimuth angle of H projected into the tangent plane. 

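Given the importance of efficient evaluation, this formula may be rewritten in a 
more computationally efficient approximation: 

\[ \rho = \frac{\rho_d}{\pi} + \rho_s \frac{1}{\sqrt{\cos\theta \cos\delta}} \, \frac{1}{4\pi\sigma_x\sigma_y} \exp\left[ -2 \, \frac{[(\mathbf{H} \cdot \mathbf{X}) / \sigma_x]^2 + [(\mathbf{H} \cdot \mathbf{Y}) / \sigma_y]^2}{1 + (\mathbf{H} \cdot \mathbf{N})} \right] \tag{15.38} \]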


As long as $\rho_s + \rho_d < 1$ and $\sigma_x, \sigma_y < 0.2$, this model has been shown to match 
measured data rather well [469]. 

This model is described by four parameters that have reasonable physical meaning: 
$\rho_d$ and $\rho_s$ are the diffuse and directional-diffuse (or rough specular) coefficients, 
and $\sigma_x$ and $\sigma_y$ represent the RMS slope distribution of the surface in the X and Y 
directions. 
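
The isotropic form (Equation 15.36) and the efficient anisotropic form (Equation 15.38) translate almost directly into code. The Python sketch below is only an illustration: the function names are hypothetical, all vectors are assumed to be unit length and to point away from the surface, X and Y are assumed to be unit tangent vectors perpendicular to N and to each other, and returning zero for directions below the horizon is a convenience chosen here, not something the text specifies.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(x / length for x in v)

def ward_isotropic(N, S, V, rho_d, rho_s, sigma):
    """Equation 15.36. S points to the light, V to the viewer."""
    cos_theta, cos_delta = dot(N, S), dot(N, V)
    if cos_theta <= 0.0 or cos_delta <= 0.0:
        return 0.0  # assumed handling for directions below the horizon
    H = normalize(tuple(s + v for s, v in zip(S, V)))
    cos_gamma = dot(N, H)
    tan2_gamma = max(0.0, 1.0 - cos_gamma ** 2) / cos_gamma ** 2
    lobe = math.exp(-tan2_gamma / sigma ** 2) / (4.0 * math.pi * sigma ** 2)
    return rho_d / math.pi + rho_s * lobe / math.sqrt(cos_theta * cos_delta)

def ward_anisotropic(N, S, V, X, Y, rho_d, rho_s, sigma_x, sigma_y):
    """Equation 15.38, the efficient approximation of the anisotropic form."""
    cos_theta, cos_delta = dot(N, S), dot(N, V)
    if cos_theta <= 0.0 or cos_delta <= 0.0:
        return 0.0  # assumed handling for directions below the horizon
    H = normalize(tuple(s + v for s, v in zip(S, V)))
    exponent = -2.0 * ((dot(H, X) / sigma_x) ** 2 + (dot(H, Y) / sigma_y) ** 2) \
        / (1.0 + dot(H, N))
    lobe = math.exp(exponent) / (4.0 * math.pi * sigma_x * sigma_y)
    return rho_d / math.pi + rho_s * lobe / math.sqrt(cos_theta * cos_delta)
```

Because both functions depend on S and V only through H and the product of the two cosines, swapping the incident and reflected directions leaves the result unchanged, which is the reciprocity property noted above. With equal slopes in X and Y, the two forms agree exactly in the mirror direction and differ slightly away from it, since Equation 15.38 is an approximation.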

Figure 15.17 (color plate) shows a set of three chairs: one a real photograph, one 
rendered with the isotropic model, and one rendered with the anisotropic model. 
Note particularly the highlight on the seat and the reflection of the back wall on the 
seat. 


15.6.3 The Programmable Model 

Each of the methods discussed above began with some approximation of a physical 
surface. Its behavior was characterized by a set of mathematical equations, which 
could then be implemented in a program. This approach creates a set of parameterized 
shading models, which essentially requires you to choose an appropriate model 
and then parameters for each material. To be successful, a user must have an intimate 
knowledge of both the model and the material in order to select the appropriate 
shading model and properly tune the parameters. Intuitive methods like the Strauss 
model relieve some of the burden, but the user then has a relatively narrow range of 
materials from which to choose, compared to all possible shading methods. 

Rather than use one single shading model, or even a small set of models, to 
describe all surfaces, we could instead write a custom shading model, or shader, for 
each surface. Since a shader is just a program, it could in theory implement any 
of the models above, but also present enough flexibility to support new kinds of 
models. 

To be useful in practice, the shader may be written in a shading language that 
supports the high-level constructs that are useful for this task. Cook presented an 
architecture [100] for building customized shaders by writing them in the form of 
tree-shaped networks. A similar language-based approach to shading was presented 
by Perlin [338], who allowed arbitrary expressions in a simple language to generate 
pictures from precomputed visibility information. 

These ideas were combined by Hanrahan and Lawson [182] into the RenderMan 
shading language, which was intended to be simple enough that many people 
would write their own shaders, yet powerful enough to produce accurate (or at least 
accurate-looking) results. The resulting language is described in detail in Upstill 
[446]. 

Although they can be powerful enough to include the physically based models 
described above, the real attraction of shading languages is in their flexibility in 
creating interesting or complex materials without excessive effort. Often these 
materials are tuned to appear realistic in some sense, but just as often they are intended 
instead to create abstract or representational materials. 

In some sense all shading is procedural (or programmable), since it always ends 
up implemented as a computer program. The explicitly programmable approach is 
based on the idea of taking small building blocks of geometric and environmental 
primitives and combining them to make a complete surface representation. Often 
normalization, reciprocity, and other physical criteria that are very important to 
physically based shaders are ignored when writing a procedural shader. This allows 
the shader to have an enormous amount of flexibility, including the local simulation 
of shadows (rather than computing shadows by explicitly interrogating the 
environment) and the introduction of atmospheric and camera effects. 


15.7 Precomputed BRDF 

The physically based and procedural reflection functions discussed above can be 
expensive to evaluate. Even simple scenes can require several million evaluations of 
the reflection function, so it is important that evaluation be as fast as possible. One 
way to speed up a slow function is to precompute a range of values and store them 
in a table, and then interpolate those values at run time. 

There is another significant advantage to precomputing the BRDF and storing it 
in sampled form. Physical models attempt to find a description of the BRDF that is 
both an accurate representation of the model and mathematically tractable. But if 
we are sampling the BRDF and storing it, then the underlying physical model can be 
arbitrary. This is powerful, because it lets us use brute-force techniques to manually 
construct the underlying small-scale geometry and turn that into a BRDF, even if 
the mathematical representation would be difficult. For example, the 
microfacet-distribution equations discussed above use the Gaussian model and characterize the 
distribution of facets by their RMS slope under this model. But if we are sampling the 
facets directly, then we can manually orient them any way we want, not necessarily 
according to a Gaussian distribution. This can make it easier to capture models 
for which we have some understanding (either intuitive or experimental), but not a 
precise description, let alone one that is tractable, fast, and easy enough to integrate 
and normalize. 


15.7.1 Sampled Hemispheres 

The sample-and-store approach was used by Kajiya [233] to efficiently implement 
his anisotropic reflectance model. Because this formula contains an integral that 
is expensive to compute, Kajiya suggested precomputing the reflection for different 
types of surfaces and storing the results in a set of coarsely sampled hemispheres. 
The idea is that the BRDF for a given direction is represented by a hemisphere of 
m cells; if we simply subdivide the hemisphere into a cells by latitude and b by 
longitude, then m = ab. We then consider each of these m cells in turn, thinking of it 
as the source of incident light. For each incoming cell we compute the reflection into 
every outgoing cell, filling up that outgoing hemisphere for that incident direction. 
Then the next incident direction cell is considered, and so on; the result is a set of m 
hemispheres. 

FIGURE 15.18 

Lining up a precomputed anisotropic BRDF with a shading point. 
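
The bookkeeping for a set of m = ab sampled hemispheres can be sketched in a few lines of Python. Everything here is an illustrative assumption and not Kajiya's implementation: the function names are hypothetical, cells are taken to be equal steps in polar angle and azimuth, directions are unit vectors in a local frame whose z axis is the surface normal, and the BRDF is sampled once at each cell center.

```python
import math

def cell_index(direction, a, b):
    """Map a unit direction (z is the normal) to a cell number in [0, a*b)."""
    x, y, z = direction
    theta = math.acos(max(-1.0, min(1.0, z)))      # polar angle from the pole
    phi = math.atan2(y, x) % (2.0 * math.pi)       # azimuth in [0, 2*pi)
    lat = min(int(theta / (0.5 * math.pi) * a), a - 1)
    lon = min(int(phi / (2.0 * math.pi) * b), b - 1)
    return lat * b + lon

def cell_center(index, a, b):
    """Unit direction through the center of the given cell."""
    lat, lon = divmod(index, b)
    theta = (lat + 0.5) * (0.5 * math.pi) / a
    phi = (lon + 0.5) * (2.0 * math.pi) / b
    s = math.sin(theta)
    return (s * math.cos(phi), s * math.sin(phi), math.cos(theta))

def build_hemispheres(brdf, a, b):
    """Precompute one outgoing hemisphere per incident cell: an m-by-m table."""
    centers = [cell_center(k, a, b) for k in range(a * b)]
    return [[brdf(w_in, w_out) for w_out in centers] for w_in in centers]

def lookup(table, w_in, w_out, a, b):
    """Nearest-cell reconstruction of the tabulated BRDF."""
    return table[cell_index(w_in, a, b)][cell_index(w_out, a, b)]
```

Nearest-cell lookup is the simplest possible reconstruction; interpolating between neighboring cells would be smoother at the cost of extra table reads.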

To apply the model, we choose the hemisphere closest to the incident angle, and 
then orient the hemisphere so its north pole lines up with the surface normal. Then 
the hemisphere is rotated around the normal until the X axis on its base plane is 
aligned with the grain of the surface, as shown in Figure 15.18. 

Given a precomputed set of hemispheres, the rendering step only requires using 
the incident light and the surface information to select the appropriate hemisphere, 
line it up, and then perform a set of table look-ups to find the reflected light. The 
trick then is to determine how to line up the hemisphere. If we can easily find 
partial derivatives of the surface, then those partials with respect to some reference 
coordinate system can be used to generate a local tangent plane, and one of the 
partials can be used as the grain direction. Unfortunately, finding these partials can 
be difficult or expensive. 

Kajiya suggested that just as texture mapping (discussed below) can be used to 
apply colors and other surface characteristics to a surface, so an entire frame can be 
mapped to a surface. A frame is an origin and a set of three orthogonal vectors: the 
normal N, the tangent T, and the binormal B. The normal is perpendicular to the 
surface, and the tangent and binormal span the tangent plane. A frame also needs 
an origin, but that’s simply the shading point itself. So by mapping the frame to the 
surface, we get a complete 3D coordinate system at each point. 

This can be done by computing a three-by-three rotation matrix for the point. We 
can store a coarsely sampled set of rotations on the surface and interpolate as with 
a regular texture map (interpolating the rotations as quaternions [404] is probably 
a good idea), or the rotations can be computed on the fly from other information. 

Since the vectors are mutually orthogonal, any one can be computed from the 
other two. The surface normal can usually be derived directly from the surface 
information itself. So if the normal can be computed unambiguously, and some 
deterministic rule generates one of the other vectors, then the third vector can be 
generated from a cross product producing a complete frame. 

One way to produce a second vector is to select one of the principal axes, and 
find the vector nearest that axis that is perpendicular to the normal. For stability, 
we can choose the axis that makes the greatest angle with the normal; this is the 
normal component with the smallest absolute magnitude. Suppose this is the X 
axis (which corresponds to the vector X). Then we find the perpendicular T from 
T = X - (X • N)N, and then B = N x T. 
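
The frame construction just described is short enough to write out directly. This Python sketch is illustrative only: the function names are hypothetical, N is assumed to be unit length, and the normalization of T is an added step (the projection alone does not produce a unit vector) that the text does not spell out.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def build_frame(N):
    """Return (T, B) so that T, B, N form a right-handed orthonormal frame."""
    # The principal axis making the greatest angle with N is the one
    # whose component in N has the smallest absolute magnitude.
    axis = min(range(3), key=lambda k: abs(N[k]))
    X = tuple(1.0 if k == axis else 0.0 for k in range(3))
    d = dot(X, N)
    T = tuple(x - d * n for x, n in zip(X, N))    # T = X - (X . N) N
    length = math.sqrt(dot(T, T))
    T = tuple(t / length for t in T)              # assumed: normalize T
    B = cross(N, T)                               # B = N x T
    return T, B
```

Choosing the axis with the smallest normal component keeps the length of the projected vector well away from zero, which is what makes the division in the normalization step stable.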

The mapping process then needs only a single number representing the angle by 
which that frame should be rotated around the normal to line it up with the grain. If 
the surface is smooth enough to allow a good representation by a uniform sampling 
(or a nonuniform texture map is used), then only a single scalar texture is needed. 
However, this technique is not invariant as the surface is moved, because it depends 
on the local orientation of the normal with respect to the global coordinate system. 
If we can find the inverse transformation of the canonical object with respect to 
the global system, this can be applied to the normal before the mapping and then 
reapplied to the complete frame to find the local transformed frame on the surface. 
If the object is rigid, then a single matrix will do for the entire surface; if 
free-form deformations are allowed, then the technique becomes more expensive, and 
eventually storing enough information to generate the local frame directly becomes 
more attractive. This time-space trade-off must be settled by the implementor based 
on practical considerations. 

A related sampled-hemisphere approach was described by Becker and Max [35], 
who stored a BRDF using a normal distribution. This is a data structure based on 
a hemisphere that has been divided into a finite number of bins, and then placed 
around a small patch of material. Each hemispherical cell contains a count of the 
number of normals that project into that cell from the material sample; initially all 
of these counts are set to 0. Points on the material are sampled, and the normal at 
each point is computed. The value stored in the hemisphere bin corresponding to 
that normal’s direction is then incremented. Becker and Max take care not to count 
normals from points that would be invisible along the normal direction from outside 
the hemisphere; that is, they account for a geometric attenuation factor similar to 
G in the Torrance-Sparrow model by explicit calculation. Becker and Max present 
an algorithm for transforming the normal table into a pair of BRDFs, one each 
for the specular and diffuse components. These tables are then interpolated during 
rendering. 


15.7.2 Spherical Harmonics 

A different approach using spherical harmonics was developed by Cabral et al. [71]. 
They wanted to represent the BRDF generated by a surface of explicit microfacets 
that they created to improve bump mapping. They irradiated a small sample of the 
opaque surface and counted where the light was reflected for each incident direction, 
simulating the experimental measurement of a BRDF. 

Assuming that the BRDF was isotropic, they realized that it was simply a function 
defined over a hemisphere, and therefore could be represented in the spherical 
harmonic basis (recall that spherical harmonics were discussed in Section 13.9). They 
noted that by truncating the spherical harmonic expansion after some finite number 
of terms, they could efficiently store and compute with the measured BRDF. 

Kajiya and Von Herzen also represented the BRDF around a volumetric scattering 
point using spherical harmonics [236]. They solved a set of first-order partial 
differential equations for the coefficients, and saved only the first few terms to store 
an approximation to the full function. 

Spherical harmonics were also used by Sillion et al. [410] to store local illumination 
information. They noted that this formulation also allowed them to easily 
accumulate and accurately store light information as it arrived at a surface. Like 
Cabral et al. [71], they assumed that the BRDF was isotropic. As in any practical 
implementation, they also truncated the expansion after a finite number of terms. 
To store the BRDF as a function of the incident angle $\theta$, Sillion et al. stored each 
coefficient in the spherical harmonic expansion in a spline as a function of this angle. 
Thus for a ray of incident light that makes an angle $\theta$ with the normal to the patch, 
they evaluated the spline for the first coefficient at this angle, then the spline for 
the second coefficient, and so on, eventually building up a list of all the coefficients. 
These were then used to weight the spherical harmonic bases to compute the full 
BRDF. 

Sillion et al. noted that this technique allows us to store smooth BRDFs with 
only a finite number of samples (as opposed to the sampled hemispheres mentioned 
above), but by the same token this method can only represent reasonably smooth 
functions. A BRDF with a significant specular component has a sharp peak that 
violates this condition, so they handle that with a separate mechanism. 

Spherical harmonics were used to store more complex anisotropic BRDFs by 
Westin et al. [475]. In this approach, a spherical harmonic expansion was used to 
store the spherical harmonic coefficients themselves for the BRDF, a process they 
described colorfully as placing “wheels within wheels.” 

The idea is based on the multiple projection of a signal (the 4D anisotropic BRDF, 
which depends on the two scalar angles each of incidence and reflection) onto a set 
of lower-dimensional bases (the 2D spherical harmonics), and then projecting the 
resulting coefficients onto the same basis set (the 2D spherical harmonics again). 
This is an instance of a general technique that allows us to efficiently represent 
multidimensional data using low-dimensional basis functions; this was discussed in 
detail in Section 5.3. 

The matrix of spherical harmonic coefficients is normally extremely large; Westin 
et al. say that ten thousand elements are typical. To compress the data, they use a 
clever observation also used by Sillion et al. [408]: when representing only the top 
half of the sphere to capture reflection, we are free to choose any signal we want 
for the lower half (and vice versa when representing transparency). Suppose that 
we simply mirror the BRDF about the equator; then all the spherical harmonics that 
are odd with respect to the equator (that is, $f(\theta, \psi) = -f(\theta, -\psi)$) will evaluate to 0. 
Like the 1D Fourier exponentials about the origin, half of the spherical harmonics 
are odd about the equator, and since we are forming the products of all the spherical 
harmonics used, one-fourth of the terms in the matrix are the product of two odd 
functions and are zero, and one-half of the terms are the product of an even and 
an odd function and therefore they too are zero. This cuts down on the size of the 
matrix to one-fourth of its original size. Westin et al. also use a reversible modulation 
of the BRDF near the equator to reduce high-frequency information, which causes 
the magnitudes of the spherical harmonics to drop off faster, thereby letting them 
retain fewer terms. 
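
The saving from the mirroring trick can be counted directly. The Python sketch below is an illustration that relies on a fact brought in from outside this section: the standard parity relation for spherical harmonics, under which the harmonic of degree l and order m picks up a factor of (-1)^(l+m) when reflected through the equator. The function names and the truncation degree are hypothetical.

```python
def is_even_about_equator(l, m):
    """True if the harmonic of degree l, order m is unchanged by mirroring
    through the equator (assumed parity rule: sign is (-1)**(l + m))."""
    return (l + m) % 2 == 0

def count_surviving(max_degree):
    """Count basis functions and matrix entries for degrees 0..max_degree.

    Returns (total functions, even functions,
             total matrix entries, entries that can be nonzero)."""
    total = 0
    even = 0
    for l in range(max_degree + 1):
        for m in range(-l, l + 1):
            total += 1
            if is_even_about_equator(l, m):
                even += 1
    return total, even, total * total, even * even
```

For a truncation at degree 9 this gives 100 basis functions, 55 of them even, so 3,025 of the 10,000 matrix entries can be nonzero. The even functions slightly outnumber the odd ones at any finite truncation, so the retained fraction sits a little above one-fourth and approaches it as more terms are kept.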

Westin et al. estimate the coefficients by a Monte Carlo sampling of an explicitly 
constructed surface patch, similar to the approaches discussed earlier. They then 
numerically process the matrix to force it to be symmetrical. Results of this approach 
may be seen in Figures 15.19 and 15.20 (color plates), illustrating the anisotropic 
surfaces of velvet and nylon. 


15.8 Volume Shading 

The shaders we discussed above were mostly concerned with surface shading, which 
occurs at the surface of an object, and not with volume shading, or the absorption 
and scattering of light inside the object itself. There isn’t a firm distinction between 
these two methods, and often one can be pressed into the service of the other; after 
all, they’re both just scattering functions. In fact, most volumetric models include 
a surface component to handle the effects at the boundary of the volume where the 
light passes from one medium to another. 

But some shading models have been developed that explicitly represent what 
occurs inside a material, and we now turn our attention to those methods. 


15.8.1 Phase Functions 

The core of any shading model is the scattering function. There are a few scattering 
functions that have been developed for efficient volume rendering that have a solid 
theoretical basis and span a wide range of materials. 

In this section we will survey the most popular of these scattering functions (also 
called phase functions when applied to volumetric scattering). We will not derive 
these functions, as the theory can be quite lengthy and complex. Derivations may be 
found in the standard references in optics listed in the section on Further Reading, 
and in references such as Bohren and Huffman [53], Denman et al. [122], and 
McCartney [292]. 

Each scattering function provides the fraction of incident light propagated from 
a scattering event as a function of the outgoing direction with respect to the incident 
direction. All of the volume scattering functions we will cover in this section are 
isotropic, and depend only on the angle $\alpha$ between the incident and outgoing 
directions, as in Figure 15.21. Phase functions are often specified in terms of $a = \cos\alpha$ 
rather than $\alpha$ itself, and we will use that convention here. 

Most scattering functions are based on the idea of a suspension of particles 
in some medium; both the particles and the medium are usually assumed to be 
independently homogeneous. Usually, the particles are also assumed to be of equal 
size (or distributed in size in some predictable way), and uniformly distributed within 
the medium. Although the index of refraction of both the particle and the medium 
of course depends on wavelength, and thus influences the scattering of light, the size 
of the particles exhibits more of an effect on the phase function than the change in 
index of refraction [506]. 

Because of this dependence on particle size, the choice of which type of scattering 
function to use for a particular problem is usually determined by the ratio of the 
particle size to the wavelength of the light involved. Rather than switch between 
models throughout the visible band, we can simply use the photopic peak wavelength 
of 555 nm as a characteristic wavelength and select a scattering model using that 
value. Because the particles also vary in size and shape, we need to choose 
characteristic values for these particle parameters. Often the particles are assumed to be 
spherical, with a characteristic radius r given by the average radius. If the 
wavelength or particle characteristic assumptions are violated, the accuracy of the model 
will be decreased. When this is unacceptable, multiple simulations may be run using 
different sets of parameters. 

FIGURE 15.21 

The angle $\alpha$ at the generic scattering event. 

TABLE 15.1 

Criterion for selecting a phase function, comparing the characteristic particle radius r with the 
characteristic light wavelength $\lambda$. 

    Condition          Phenomenon 
    $r \ll \lambda$        Atmospheric absorption 
    $r < \lambda$          Rayleigh scattering 
    $r \approx \lambda$    Mie scattering 
    $r \gg \lambda$        Geometrical optics 

Inakage has presented a useful summary of the four principal classes of phase 
functions and their selection criteria [225]; this summary is listed in Table 15.1. 

Table 15.1 tells us that when the particles are far smaller than the wavelength 
of light, there is no appreciable scattering, and the light is simply absorbed. This is 
what happens to light passing through the near-vacuum of space. When the particles 
are much bigger than the wavelength of light, geometrical optics come into play; this 
is the typical situation when light strikes most solid objects such as wood or metal. 







FIGURE 15.22

(a) The Rayleigh phase function. (b) The function in (a) rotated around the incident vector.


The intermediate cases occur when the particle sizes are comparable to the wavelength
of light.

When the particles are somewhat smaller than the wavelength of light, we see
phenomena described by Rayleigh scattering; such particles include cigarette smoke
and dust. Klassen suggests that Rayleigh scattering should be used when
$r/\lambda < 0.05$ [247]. The Rayleigh scattering function is given by

$$P_R(a) = \frac{3}{4}\,(1 + a^2) \tag{15.39}$$

where $a$ is the cosine of the scattering angle.

The Rayleigh scattering function is simple enough that it can be directly implemented
and used without approximation. The Rayleigh function is plotted in
Figure 15.22(a). Remember that this function is isotropic around the incident direction,
so the full 3D function is a surface of revolution found by rotating this curve around
the central axis, as in Figure 15.22(b).
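Because the Rayleigh function is a polynomial in the cosine of the scattering angle, it can be coded in one line. The following Python sketch assumes the leading constant of Equation 15.39 is 3/4, the value for which the function averages to 1 over the sphere; the helper `sphere_average` and its midpoint-rule integration are our own illustrative choices.

```python
def rayleigh_phase(a):
    """Rayleigh phase function; a is the cosine of the scattering angle."""
    return 0.75 * (1.0 + a * a)


def sphere_average(phase, steps=20000):
    """Average a phase function over all directions.

    The function depends only on a = cos(angle), so the average over the
    sphere reduces to (1/2) * integral of phase(a) for a in [-1, 1].
    A midpoint rule is used here purely for illustration.
    """
    da = 2.0 / steps
    total = 0.0
    for i in range(steps):
        a = -1.0 + (i + 0.5) * da
        total += phase(a)
    return 0.5 * total * da


if __name__ == "__main__":
    print(rayleigh_phase(0.0))             # at right angles to the incident direction
    print(rayleigh_phase(1.0))             # along the incident direction
    print(sphere_average(rayleigh_phase))  # should be very close to 1
```

The same averaging helper can be applied to any of the other phase functions in this section to see how each one is scaled.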

When particles are comparable to the wavelength of light, such as for water
droplets or fog, the more complex theory of Mie scattering becomes applicable.
Nishita et al. have reported [320] from the optical literature that the expensive Mie
scattering functions may be efficiently approximated for sparse and dense particle
densities, called hazy and murky, respectively. The hazy Mie and murky Mie
approximations are given by

$$\begin{aligned}
P_{HM}(a) &= 1 + 9\left(\frac{1 + a}{2}\right)^{8} \\
P_{MM}(a) &= 1 + 50\left(\frac{1 + a}{2}\right)^{32}
\end{aligned} \tag{15.40}$$

These two phase functions are shown in Figure 15.23.

FIGURE 15.23

(a) The hazy Mie phase function. (b) The murky Mie phase function.
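The two approximations differ only in the weight and the exponent of the lobe term, which is easy to see in code. This is a minimal sketch that assumes the forms $1 + 9((1+a)/2)^8$ and $1 + 50((1+a)/2)^{32}$ with $a$ the cosine of the scattering angle; `lobe_half_width_cosine` is a hypothetical helper of ours that reports where the lobe term has fallen to half of its peak.

```python
def hazy_mie_phase(a):
    """Hazy Mie approximation: 1 + 9 ((1 + a) / 2)^8."""
    return 1.0 + 9.0 * ((1.0 + a) / 2.0) ** 8


def murky_mie_phase(a):
    """Murky Mie approximation: 1 + 50 ((1 + a) / 2)^32."""
    return 1.0 + 50.0 * ((1.0 + a) / 2.0) ** 32


def lobe_half_width_cosine(exponent):
    """Cosine a at which ((1 + a) / 2)^exponent equals one half."""
    return 2.0 * 0.5 ** (1.0 / exponent) - 1.0


if __name__ == "__main__":
    for a in (-1.0, 0.0, 0.5, 1.0):
        print(a, hazy_mie_phase(a), murky_mie_phase(a))
    # The larger exponent of the murky function gives a much narrower lobe.
    print(lobe_half_width_cosine(8), lobe_half_width_cosine(32))
```

Both functions sit on a constant floor of 1, so away from the lobe they behave like uniform scattering.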

Note that the scattering functions in Figure 15.23 look like ellipses (albeit with a
small blip at one end). Another popular approximation to the Mie functions is given
by the Henyey-Greenstein phase function [49]:

$$P_{HG}(g, a) = \frac{1 - g^2}{(1 - 2ga + g^2)^{1.5}} \tag{15.41}$$

The Henyey-Greenstein phase function produces an ellipse with eccentricity $g$, and
one focus at the origin. By varying $g$, we can sweep from predominantly backscattering
for $g > 0$, to uniform scattering at $g = 0$, to predominantly forward scattering
for $g < 0$. The Henyey-Greenstein phase function is plotted in Figure 15.24 for
a few different values of $g$; compare it to the hazy and murky Mie functions in
Figure 15.23.
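A direct transcription of the Henyey-Greenstein function makes its behavior with $g$ easy to explore. The sketch below assumes the form $(1 - g^2)/(1 - 2ga + g^2)^{1.5}$, with $a$ the cosine of the scattering angle; the function name and the range check on $g$ are our own additions.

```python
def henyey_greenstein_phase(g, a):
    """One-term Henyey-Greenstein phase function.

    g is the eccentricity (-1 < g < 1) and a is the cosine of the
    scattering angle (-1 <= a <= 1).
    """
    if not -1.0 < g < 1.0:
        raise ValueError("eccentricity g must lie strictly between -1 and 1")
    return (1.0 - g * g) / (1.0 - 2.0 * g * a + g * g) ** 1.5


if __name__ == "__main__":
    # The eccentricities plotted in Figure 15.24.
    for g in (-0.6, -0.25, 0.0, 0.25, 0.6):
        row = [round(henyey_greenstein_phase(g, a), 4) for a in (-1.0, 0.0, 1.0)]
        print(g, row)
```

Two properties are worth checking by hand: at $g = 0$ the function is 1 for every $a$, and negating both $g$ and $a$ leaves the value unchanged, so the curves for $-g$ are mirror images of those for $g$.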

FIGURE 15.24

The Henyey-Greenstein phase function plotted for $g = -0.6, -0.25, 0, 0.25, 0.6$.


Observe that the Rayleigh function appears to be two elliptical lobes brought
together, and that the small lobe on the hazy and murky Mie functions is also
elliptical. This suggests that a function that combines two ellipses should be able to
match these functions pretty closely.

The two-term Henyey-Greenstein (TTHG) model takes just this approach [506].
It linearly combines two ellipses of different magnitudes and eccentricities:

$$P_{TTHG}(r, g_1, g_2, a) = r\,\frac{1 - g_1^2}{(1 - 2g_1 a + g_1^2)^{1.5}} + (1 - r)\,\frac{1 - g_2^2}{(1 - 2g_2 a + g_2^2)^{1.5}} \tag{15.42}$$


Schlick [380] has developed a phase function that approximates the TTHG model,
and thus can match both Rayleigh and Mie scattering as well. The model has the
important advantage of avoiding the expensive fractional exponentiation in the full
TTHG model. Like the TTHG formula, the Schlick phase function depends on
$a$, the cosine of the scattering angle, a blending parameter $r$, and two eccentricity
parameters $g_1$ and $g_2$:

$$P_S(r, g_1, g_2, a) = r\,\frac{1 - g_1^2}{(1 - g_1 a)^2} + (1 - r)\,\frac{1 - g_2^2}{(1 - g_2 a)^2} \tag{15.43}$$

Figure 15.25 compares the Schlick phase function with the Rayleigh, hazy Mie,
murky Mie, and Henyey-Greenstein functions. The values used for the match to the
first three are given in Blasi et al. [45] and summarized in Table 15.2. To match the
one-term Henyey-Greenstein function, we set $r = 1$ and $g_2 = 0$.
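As a rough check on the parameters in Table 15.2, we can evaluate the Schlick function beside the function it is meant to match. The sketch below assumes each Schlick lobe has the form $(1 - g^2)/(1 - ga)^2$, which is the form whose inversion yields the sampling rule of Equation 15.44; the function names are ours, and the Rayleigh function is taken to be $\frac{3}{4}(1 + a^2)$.

```python
def schlick_lobe(g, a):
    """One lobe of the Schlick phase function (assumed form, see lead-in)."""
    return (1.0 - g * g) / (1.0 - g * a) ** 2


def schlick_phase(r, g1, g2, a):
    """Blend of two Schlick lobes with weights r and (1 - r)."""
    return r * schlick_lobe(g1, a) + (1.0 - r) * schlick_lobe(g2, a)


def rayleigh_phase(a):
    """Rayleigh phase function, used here as the target of the fit."""
    return 0.75 * (1.0 + a * a)


if __name__ == "__main__":
    # Rayleigh row of Table 15.2: r = 0.50, g1 = -0.46, g2 = 0.46.
    r, g1, g2 = 0.50, -0.46, 0.46
    for a in (-1.0, -0.5, 0.0, 0.5, 1.0):
        print(a, rayleigh_phase(a), schlick_phase(r, g1, g2, a))
```

With these parameters the two columns agree to within a few percent, and no fractional power is evaluated anywhere in the Schlick version.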

FIGURE 15.25

The Schlick phase function and (a) the Rayleigh function (dashed), (b) the hazy Mie function
(dashed), (c) the murky Mie function (dashed), (d) the Henyey-Greenstein function (dashed).
The Henyey-Greenstein function is plotted for $g = -0.6, -0.5, -0.25$, and the Schlick function
for $g_1 = -0.667, -0.5, -0.55$.


The Schlick function has the advantage of simplicity (and therefore speed), and
the fact that we can invert it to draw samples distributed in proportion to the
function. Given a uniform real random variable $u \in [0,1]$, a value of $a$ distributed
according to one lobe of the Schlick function with eccentricity $g$ is given by Schlick [380]:

$$a = \frac{2u + g - 1}{2gu - g + 1} \tag{15.44}$$

This is very useful when using Monte Carlo methods to simulate scattering events.
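The sampling rule of Equation 15.44 can be checked by pushing its output back through the cumulative distribution of a single Schlick lobe. This is a sketch under our own assumptions: the lobe is taken as $(1 - g^2)/(1 - ga)^2$, and dividing it by 2 makes it a density over $a \in [-1, 1]$; that normalization and the function names are ours, not quoted from Schlick.

```python
def sample_schlick_cosine(g, u):
    """Equation 15.44: map a uniform u in [0, 1] to a cosine a in [-1, 1]."""
    return (2.0 * u + g - 1.0) / (2.0 * g * u - g + 1.0)


def schlick_lobe_cdf(g, a):
    """Cumulative distribution of a for the density (1 - g^2) / (2 (1 - g a)^2).

    Obtained by integrating the density from -1 to a (our own bookkeeping).
    """
    if g == 0.0:
        return 0.5 * (a + 1.0)   # uniform in the cosine
    return (1.0 - g * g) / (2.0 * g) * (1.0 / (1.0 - g * a) - 1.0 / (1.0 + g))


if __name__ == "__main__":
    g = -0.5
    for u in (0.0, 0.25, 0.5, 0.75, 1.0):
        a = sample_schlick_cosine(g, u)
        # The last column should reproduce u if the inversion is consistent.
        print(u, a, schlick_lobe_cdf(g, a))
```

In a Monte Carlo scattering loop, `u` would come from a random number generator, and the returned cosine would be combined with a uniformly chosen azimuth around the incident direction.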


Function      $r$     $g_1$    $g_2$
Rayleigh      0.50    -0.46    0.46
Hazy Mie      0.12    -0.50    0.70
Murky Mie     0.19    -0.65    0.91

TABLE 15.2

The values of $r$, $g_1$, and $g_2$ in the Schlick phase function that match the Rayleigh and two Mie
functions. Source: Data from Blasi et al. in Proc. Eurographics '93.


Three other useful scattering functions were discussed by Blinn [48]. The constant
function $P_C$ is just a single scalar for all angles:

$$P_C = 1 \tag{15.45}$$

The simple anisotropic function $P_{SI}$ is a function of the weighted cosine of the angle
and a constant term:

$$P_{SI}(w, a) = 1 + wa \tag{15.46}$$

for some real number $w$. Finally, the Lambert function $P_L$ is given by integrating
the brightness of a visible disk:

$$P_L(a) = \frac{8}{3\pi}\left[\sin\alpha + (\pi - \alpha)\,a\right] \tag{15.47}$$

where $\alpha = \cos^{-1} a$.
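The three functions discussed by Blinn are short enough to write out directly. In this sketch the leading factor of the Lambert function is assumed to be $8/(3\pi)$, the value that makes the function average to 1 over the sphere; the function names and the clamping of $a$ before the arccosine are our own choices.

```python
import math


def constant_phase(a):
    """Constant function: the same value for every angle."""
    return 1.0


def simple_anisotropic_phase(w, a):
    """Simple anisotropic function: 1 + w a for some real weight w."""
    return 1.0 + w * a


def lambert_phase(a):
    """Lambert function: (8 / (3 pi)) [sin(alpha) + (pi - alpha) a],
    where alpha = arccos(a)."""
    a = max(-1.0, min(1.0, a))      # guard against round-off outside [-1, 1]
    alpha = math.acos(a)
    return 8.0 / (3.0 * math.pi) * (math.sin(alpha) + (math.pi - alpha) * a)


if __name__ == "__main__":
    for a in (-1.0, 0.0, 1.0):
        print(a, constant_phase(a), simple_anisotropic_phase(0.5, a), lambert_phase(a))
```

The Lambert function falls smoothly from its largest value at $a = 1$ to zero at $a = -1$, which is the behavior of a diffusely reflecting sphere seen at different phases.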

A summary of how to choose among the different scattering functions in a variety 
of situations is given in Figure 15.26. 


15.8.2 Atmospheric Modeling 

An important special case of volume shading is simulation of the atmosphere of Earth. 
Much work has been done in computer graphics to develop efficient algorithms for 
rendering Earth’s atmosphere and atmospheric effects. Our interest here is primarily 
in the scattering functions and not in techniques for evaluating them in various 
situations; references on these important algorithmic methods may be found in the 
section on Further Reading. 

The components of the atmosphere that most strongly affect light from the sun 
(or solar radiation) are four permanent gases—nitrogen, oxygen, carbon dioxide, 
and argon—along with aerosol, ozone, and water vapor. Explicit formulas may be 
written for the absorption and scattering of light by each of these components per 
unit length; a summary of such formulas may be found in Zibordi and Voss [506]. 

An analysis of light absorption and scattering through atmosphere over the 
ground has been given by Inakage [225] and is illustrated in Figure 15.27. Of 
the solar radiation arriving at the outer edge of the atmosphere, 29% of the radi¬ 
ation can be thought of as interacting with the atmosphere, 47% with clouds, and 
the remaining 24% directly with the ground. Taking these in turn, 17% of the 
incident radiation is absorbed by the atmosphere itself, primarily by the gases and 
other components listed above, and 12% of this arriving radiation is scattered. Half 
of this radiation is backscattered and returns to space, and half is forward scattered 
and continues on to the ground. Returning to the original solar radiation, we said
that 47% interacts with clouds. A total of 3% of the incident radiation is absorbed
by clouds, 20% is reflected back into space, and 24% continues downward to the
earth. The third category is the 24% of the solar radiation that makes it directly to
the ground without interacting with the atmosphere or clouds; 20% of the incident
radiation is directly absorbed by the ground, while 4% is reflected back into space.

FIGURE 15.27

The distribution of light in the atmosphere over the ground. Redrawn from Inakage, The Visual
Computer, fig. 1, p. 105.
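The percentages quoted from Inakage's analysis can be tallied to confirm that they account for all of the arriving radiation. The sketch below only restates the numbers in the text; the dictionary layout and key names are ours, and the 12% scattered by the atmosphere is split evenly into its backscattered and forward-scattered halves.

```python
# Percentages of the solar radiation arriving at the outer edge of the
# atmosphere, as quoted in the text.
ATMOSPHERE = {"absorbed": 17, "backscattered_to_space": 6, "forward_scattered_to_ground": 6}
CLOUDS = {"absorbed": 3, "reflected_to_space": 20, "transmitted_to_ground": 24}
GROUND_DIRECT = {"absorbed": 20, "reflected_to_space": 4}


def category_totals():
    """Total percentage handled by each of the three categories."""
    return {
        "atmosphere": sum(ATMOSPHERE.values()),
        "clouds": sum(CLOUDS.values()),
        "ground_direct": sum(GROUND_DIRECT.values()),
    }


def returned_to_space():
    """Percentage that leaves the planet again by any of the three routes."""
    return (ATMOSPHERE["backscattered_to_space"]
            + CLOUDS["reflected_to_space"]
            + GROUND_DIRECT["reflected_to_space"])


if __name__ == "__main__":
    totals = category_totals()
    print(totals, sum(totals.values()))
    print(returned_to_space())
```

The three categories come to 29%, 47%, and 24%, which sum to 100%.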

The actual distribution of light due to Rayleigh scattering is generally inversely 
dependent on the fourth power of the wavelength of the light. Two different models 
have been proposed to simulate this scattering in the atmosphere for graphics. 

Inakage [225] has suggested modeling Rayleigh scattering with the equation 


Pn(a) = 


7r(l + cos a) 
A* 


•N 


W-y)V 

7]d 


l2 


(15.48) 


where

$\eta$ is the simple index of refraction of air,
$\eta'$ is the simple index of refraction of the scattering particles,
$N$ is the number of particles per cubic centimeter,
$V$ is the volume of each scattering particle in cubic centimeters,
$d$ is the distance from the scattering event to the viewer, and
$\lambda$ is the wavelength of light.







Note that cos a = cos 2 a. A similar equation has been suggested by Nishita et al. 
[323]: 


PM = - .)’ (15.49) 

where in addition to the terms above, $N_s$ is the molecular number density of the
standard atmosphere, and $\rho$ is an altitude-dependent function given by

$$\rho = \exp(-h/H_0) \tag{15.50}$$

where $H_0 = 7{,}994$ meters (at sea level, $\rho = 1$). They model the attenuation of the
light as an extinction per unit length $\beta$, given by

$$\beta = \frac{8\pi^3 (n^2 - 1)^2}{3 N_s \lambda^4} \tag{15.51}$$
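The altitude falloff and the extinction coefficient are both simple closed forms. The sketch below assumes $\rho = \exp(-h/H_0)$ with $h$ the altitude in meters, and $\beta = 8\pi^3(n^2 - 1)^2 / (3 N_s \lambda^4)$ with all lengths in meters. The sample values for the index of refraction of air (1.0003) and the number density ($2.55 \times 10^{25}$ per cubic meter) are illustrative assumptions of ours, not values taken from the text.

```python
import math

SCALE_HEIGHT_M = 7994.0   # H_0 from the text, in meters


def density_ratio(h):
    """Altitude-dependent factor rho; equals 1 at sea level (h = 0)."""
    return math.exp(-h / SCALE_HEIGHT_M)


def rayleigh_extinction(wavelength, n, number_density):
    """Extinction per unit length, beta, for light of the given wavelength."""
    return (8.0 * math.pi ** 3 * (n * n - 1.0) ** 2
            / (3.0 * number_density * wavelength ** 4))


if __name__ == "__main__":
    n_air, n_s = 1.0003, 2.55e25           # assumed sample values
    blue = rayleigh_extinction(450e-9, n_air, n_s)
    red = rayleigh_extinction(650e-9, n_air, n_s)
    print(blue, red, blue / red)           # blue is attenuated more strongly
    print(density_ratio(0.0), density_ratio(SCALE_HEIGHT_M))
```

Whatever values are used for $n$ and $N_s$, the ratio of the extinction at two wavelengths depends only on the fourth power of the wavelength ratio.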


The results obtained by Nishita et al. [323] are shown in Figure 15.28 (color 
plate). In Figure 15.28(a), we see the atmosphere rendered without a planetary 
model; in Figure 15.28(b), the Earth and clouds have been added. Notice the change 
in color near the edges of the shadow. 

Nishita et al. have reported [323] on a different approximate Mie phase function 
that matches experimental data for the atmosphere better than the one-term Henyey- 
Greenstein function. This better fit comes at a significantly increased evaluation cost. 
The Cornette function is given by 


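$$P_C(a, g) = \frac{3(1 - g^2)}{2(2 + g^2)}\,\frac{1 + a^2}{(1 - 2ga + g^2)^{1.5}} \tag{15.52}$$

where $g$ is given by

$$g = \frac{5}{9}u - \left(\frac{4}{3} - \frac{25}{81}u^2\right)x^{-1/3} + x^{1/3} \tag{15.53}$$

and $x$ is given by

$$x = \frac{5}{9}u + \frac{125}{729}u^3 + \sqrt{\frac{64}{27} - \frac{325}{243}u^2 + \frac{1250}{2187}u^4} \tag{15.54}$$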
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T 

; 

91 

92 

0.962 


0.759 

0.968 


0.537 


TABLE 15.3

Parameters for the two-term Henyey-Greenstein model for Earth atmospheric simulation. Source: 
Data from Zibordi and Voss in Remote Sensing of the Environment , p. 357. 


where $u$ specifies the atmospheric condition and ranges from 0.7 to 0.85. Unfortunately,
this function requires many more operations than the one-term Henyey-Greenstein
function, including at least one new cube root.
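To see the extra work the Cornette function requires, we can write out the chain from $u$ to $x$ to $g$ to the phase value. This is a sketch that assumes the expressions $x = \frac{5}{9}u + \frac{125}{729}u^3 + \sqrt{\frac{64}{27} - \frac{325}{243}u^2 + \frac{1250}{2187}u^4}$ and $g = \frac{5}{9}u - (\frac{4}{3} - \frac{25}{81}u^2)\,x^{-1/3} + x^{1/3}$; the function names are ours.

```python
import math


def cornette_g(u):
    """Eccentricity g for atmospheric condition u (0.7 <= u <= 0.85)."""
    x = ((5.0 / 9.0) * u
         + (125.0 / 729.0) * u ** 3
         + math.sqrt(64.0 / 27.0
                     - (325.0 / 243.0) * u ** 2
                     + (1250.0 / 2187.0) * u ** 4))
    cube_root = x ** (1.0 / 3.0)            # the cube root mentioned in the text
    return ((5.0 / 9.0) * u
            - (4.0 / 3.0 - (25.0 / 81.0) * u ** 2) / cube_root
            + cube_root)


def cornette_phase(a, g):
    """Cornette phase function for cosine a and eccentricity g."""
    return (3.0 * (1.0 - g * g) / (2.0 * (2.0 + g * g))
            * (1.0 + a * a) / (1.0 - 2.0 * g * a + g * g) ** 1.5)


if __name__ == "__main__":
    for u in (0.7, 0.75, 0.8, 0.85):
        g = cornette_g(u)
        print(u, g, cornette_phase(1.0, g), cornette_phase(-1.0, g))
```

When $g = 0$ the expression collapses to $\frac{3}{4}(1 + a^2)$, so the Cornette function can be read as a Rayleigh-shaped term multiplied by a Henyey-Greenstein-shaped lobe.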

Zibordi and Voss [506] found that when using the two-term Henyey-Greenstein
model to represent atmospheric scattering, the accuracy of an atmospheric shading
model is highly sensitive to the eccentricity factors $g_1$ and $g_2$. They found good
agreement to measured data from a clear sky by using either of the two sets of
parameters listed in Table 15.3.

A single-scattering model of a homogeneous volume was used by Max [286] to 
simulate the glow of illuminated air. He demonstrated the shafts of light visible near 
the ground as light passes through the leaves of overhead trees. 

Other planetary bodies have been studied as well. Blinn [48] noted that the
reflectance of the moon may be modeled by combining a Lambert term for the
backscattering due to rough particles with a simple anisotropic term for the forward
scattering due to glass-like spherical particles:

$$P_{Moon}(a) = w_1 P_L(a) + w_2 P_{SI}(a) \tag{15.55}$$

Blinn [48] has noted that a two-term Henyey-Greenstein function may be used
to model the rings of Saturn as seen from the Earth, with $r = 0.596$, $g_1 = 0.5$, and
$g_2 = -0.5$. For his simulation, Blinn chose a combination of a Lambert term and a
one-term Henyey-Greenstein function, using coefficients that vary with the radius $r$
from the planet's center:

$$P_{Saturn}(a) = w_1(r)\,P_L(a) + w_2(r)\,P_{HG}(-0.5, a) \tag{15.56}$$

FIGURE 15.29

The two-layer model used to model the ocean. Redrawn from Nishita et al. in Computer Graphics
(Proc. Siggraph '93), fig. 5, p. 179.


15.8.3 The Earth's Ocean 

Another significant natural object is the ocean that covers much of the Earth. Nishita 
et al. [323] have developed a shading model tuned for the ocean based on a two-layer 
model illustrated in Figure 15.29. 

The scattering model is based on a combination of the light scattered at the 
ocean’s surface, plus light scattered back up to the surface from inside the water 
and then transmitted back into the air. This quasi-single-scattering model is quite 
complex and uses several variables we haven’t encountered yet; to reduce confusion, 
we will explicitly write out all of the wavelength dependencies. The light arrives at 
the water at point S and scattered light departs from point P. 

Ti^OiJToW^Ojo)^ A) 
p I*? n 2 {cos0i O + cosOji)c(X)[l — u;o(A)F(A)] 


x (1 — exp[—zc(A)[l - u;o(A)F(A)](sec0j; -hsec^ 0 )]) (15.57) 
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where 

$\lambda$ is wavelength,

$z$ is the depth of the sea,

On is the angle between the normal at P and the viewing direction, 

0 io is the angle between the direction of the zenith and the incident light, 

0j o is the angle between the reverse direction of the zenith and the refracted light, 
n is the simple index of refraction of the water, 

$T_i$ is the transmittance at point $S$,

$T_o$ is the transmittance at point $P$,
$c(\lambda)$ is the attenuation of light per unit length,

0 is the volume scattering function, 
$\omega_0$ is the albedo of water, and

$F$ is the fraction of the scattering coefficient in the forward direction.

15.8.4 The Kubelka-Munk Pigment Model

One common type of material used for covering surfaces is paint. Paint is physically 
composed of many small colored particles of pigment suspended in some sort of 
mostly transparent and colorless base such as oil [444]. 

Materials that exhibit selective reflection and selective absorption are called pig¬ 
ments and dyes when there is little luminescence produced due to excitation. The 
nature of a pigment may vary with absorption. Some materials change their reflec¬ 
tive properties in response to strong irradiation, in a process known as solarization . 
For example, the almost-clear compound cubic potassium chloride can be made to 
tenebresce , or darken and bleach from exposure to strong light [267]. Sometimes 
this effect can be reversed. A material that responds to incident light by getting 
darker is called a scotophor (meaning dark-bearer ), in contrast to the better-known 
term phosphor (light-bearer) [267]. 

Usually paint is formulated so that it stays wet during storage but dries upon 
application, either through evaporation of part of the base material, or as a result 
of a chemical interaction with air. An analysis of the chemistry of paint shows that 
the color is usually generated by one of a handful of molecular structures attached 
to some larger carrier molecule [444]. 

One approach to handling paint is to directly simulate the atomic and molecular 
interaction of these chromophores (meaning color-bearer ). Alternatively, we can 
work at a macroscopic level and simply model the aggregate behavior of the paint 
with respect to incident light. This approach was taken by Kubelka and Munk, who 
developed a simple relationship between the scattering and absorption coefficients of 
paint and its overall reflectance [255]. The Kubelka-Munk theory has been discussed
for computer graphics applications in Fishkin [145] and Haase and Meyer [175].
We will rederive this model here because it has good practical value, it's not very
complicated and yet provides a very good match with the phenomena it models, and
it presents another useful application of the transport theory we developed earlier.
Perhaps the most important reason is that this gives us an example of a reasonably
complex model that we can derive from first principles and then analyze in some
detail.

FIGURE 15.30

Paint on a surface. Redrawn from Haase and Meyer in ACM Transactions on Graphics, fig. 20,
p. 332.

We start by imagining a surface that has been coated by a layer of paint with
thickness $h$, as in Figure 15.30. We suppose that the paint is homogeneous, has a
scattering coefficient of $\sigma_s$ and an absorption coefficient of $\sigma_a$, and has been applied
with uniform thickness $h$. We assume that we know the reflectance $\rho_0$ of the substrate
material to which the paint has been applied. Remember that $\sigma_s$, $\sigma_a$, and $\rho_0$ are all
functions of wavelength. The Kubelka-Munk literature often uses the letters $K$ and
$S$ for $\sigma_a$ and $\sigma_s$.

Consider some differential horizontal slice of thickness $dh$ within the paint. Label
the flux that is descending toward the surface as $\Phi_d$, and the upward-moving flux
$\Phi_u$ (note that these can each be the result of multiple scattering events with the paint
material). Then, given the reflectivity $\rho_0$ of the substrate, we can find the reflectivity
$\rho_h$ of the substrate with a layer of paint of thickness $h$ by solving a transport problem
very much like the ones in Chapter 12. This is the approach taken by the Kubelka-Munk
theory.

We will follow the derivation in Haase and Meyer [175], which uses arguments
similar to those in Chapter 12. We begin by noting that the loss in the descending
and ascending fluxes due to a single scattering or absorption event is given by

$$\begin{aligned}
\Delta_d^- &= (\sigma_a + \sigma_s)\,\Phi_d\,dh \\
\Delta_u^- &= (\sigma_a + \sigma_s)\,\Phi_u\,dh
\end{aligned} \tag{15.58}$$

On the other hand, the gains in each flux come from scattering alone. Assuming a
single scattering event in the layer $dh$, the gains are

$$\begin{aligned}
\Delta_d^+ &= \sigma_s\,\Phi_u\,dh \\
\Delta_u^+ &= \sigma_s\,\Phi_d\,dh
\end{aligned} \tag{15.59}$$


So the total loss in each direction is found by the loss minus the gain:

$$\begin{aligned}
d\Phi_d &= \Delta_d^- - \Delta_d^+ \\
        &= (\sigma_a + \sigma_s)\,\Phi_d\,dh - \sigma_s\,\Phi_u\,dh \\
d\Phi_u &= -[\Delta_u^- - \Delta_u^+] \\
        &= -[(\sigma_a + \sigma_s)\,\Phi_u\,dh - \sigma_s\,\Phi_d\,dh]
\end{aligned} \tag{15.60}$$

(The upward-moving quantity is negated so that we can measure both changes in
the same coordinate system, and $d\Phi_u$ is in the opposite direction as $d\Phi_d$.)

Writing $a = 1 + (\sigma_a/\sigma_s)$, we have the pair of differential equations

$$\begin{aligned}
\frac{d\Phi_d}{\sigma_s\,dh} &= a\Phi_d - \Phi_u \\
\frac{-d\Phi_u}{\sigma_s\,dh} &= a\Phi_u - \Phi_d
\end{aligned} \tag{15.61}$$


Multiplying by $\Phi_u$ and $\Phi_d$ respectively, and adding the results, we find

$$\frac{\Phi_u\,d\Phi_d - \Phi_d\,d\Phi_u}{\sigma_s\,dh} = a\Phi_d\Phi_u - \Phi_u^2 + a\Phi_d\Phi_u - \Phi_d^2 \tag{15.62}$$

and then multiplying both sides by $-1/\Phi_d^2$ we find

$$\frac{\Phi_d\,d\Phi_u - \Phi_u\,d\Phi_d}{\Phi_d^2\,\sigma_s\,dh} = -2a\,\frac{\Phi_u}{\Phi_d} + \left(\frac{\Phi_u}{\Phi_d}\right)^2 + 1 \tag{15.63}$$

From the Quotient Rule we can observe that

$$\frac{\Phi_d\,d\Phi_u - \Phi_u\,d\Phi_d}{\Phi_d^2\,\sigma_s\,dh} = \frac{d(\Phi_u/\Phi_d)}{\sigma_s\,dh} \tag{15.64}$$

Writing $r = \Phi_u/\Phi_d$, we can simplify Equation 15.63 to

$$\frac{dr}{\sigma_s\,dh} = r^2 - 2ar + 1 \tag{15.65}$$

or by rearrangement and integration,

$$\int \frac{dr}{r^2 - 2ar + 1} = \int \sigma_s\,dh = \sigma_s \int dh \tag{15.66}$$

Since we have assumed the paint is homogeneous, the scattering coefficient $\sigma_s$ is
constant throughout the material and can be brought outside the integral on the
right-hand side.

Our goal is to find a value for $\rho_h$ given a paint thickness $h$. When $h = 0$, the
paint is gone, and we are left with the substrate reflectivity $\rho_0$. So we are interested
in evaluating the integral of Equation 15.66 over the range $\rho_0$ to $\rho_h$. We cannot
directly find the integral, but we note that it is a rational fraction, that is, the ratio of
two polynomials in $r$; the numerator is just a constant. To simplify such an integral,
we factor the denominator by writing $b = \sqrt{a^2 - 1}$. Then a bit of algebra lets us
write the denominator as the product

$$r^2 - 2ar + 1 = [r - (a + b)][r - (a - b)] \tag{15.67}$$

Using the method of partial fractions, we write

$$\frac{1}{r^2 - 2ar + 1} = \frac{A_1}{r - (a + b)} + \frac{A_2}{r - (a - b)} \tag{15.68}$$


By plugging in $r = a - b$ and $r = a + b$ and simplifying, we find that $A_1 = 1/2b$ and
$A_2 = -1/2b$, so plugging these in and integrating

$$\int_{\rho_0}^{\rho_h} \frac{dr}{r^2 - 2ar + 1} = \int_{\rho_0}^{\rho_h} \frac{1/2b}{r - (a + b)}\,dr + \int_{\rho_0}^{\rho_h} \frac{-1/2b}{r - (a - b)}\,dr \tag{15.69}$$

From any table of integrals (such as Beyer [41]), we find

$$\int \frac{dr}{r - c} = \ln(r - c) \tag{15.70}$$

Applying this to Equation 15.69, we find

$$\frac{1}{2b}\Big[\ln[\rho_h - (a + b)] - \ln[\rho_0 - (a + b)] - \ln[\rho_h - (a - b)] + \ln[\rho_0 - (a - b)]\Big] = \sigma_s h \tag{15.71}$$


Multiplying both sides by $2b$ and exponentiating, we find

$$\frac{(\rho_h - a - b)(\rho_0 - a + b)}{(\rho_0 - a - b)(\rho_h - a + b)} = \exp[2b\sigma_s h] \tag{15.72}$$

Recall that our goal is to find $\rho_h$, the reflectivity of the substrate seen through a
layer of paint of thickness $h$. Let's assume that the paint is applied so thickly that
the substrate is completely invisible. Mathematically, we write the thickness $h \to \infty$,
so $\rho_0 \to 0$. We write the reflectance as $\rho_\infty$, or more simply as just $\rho$. This simplifies
Equation 15.72 to

$$(-a - b)(\rho - a + b) = \frac{(\rho - a - b)(-a + b)}{\exp[2b\sigma_s h]} \tag{15.73}$$

If we expand both sides and cancel common factors, we get

$$(a^2 - \rho a - b^2)\left(1 - \frac{1}{\exp[2b\sigma_s h]}\right) = b\rho\left(1 + \frac{1}{\exp[2b\sigma_s h]}\right) \tag{15.74}$$

As $h \to \infty$, both of the fractions go to 0, leaving us with

$$a^2 - \rho a - b^2 = b\rho \tag{15.75}$$

Solving for $\rho$, we find

$$\rho = a - b = \frac{1}{a + b} \tag{15.76}$$

We can show (see Exercise 5) that this may be written in terms of the scattering
coefficients as

$$\rho = 1 + \sigma_c - \sqrt{\sigma_c^2 + 2\sigma_c} \tag{15.77}$$

where we use the combined coefficient $\sigma_c = \sigma_a/\sigma_s$, or as

$$\sigma_c = \frac{(1 - \rho)^2}{2\rho} \tag{15.78}$$
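The infinite-thickness result and its inverse form a convenient pair: one turns coefficients into a reflectance, the other recovers the coefficient ratio from a measured reflectance. The sketch below assumes the forms $\rho = 1 + \sigma_c - \sqrt{\sigma_c^2 + 2\sigma_c}$ and $\sigma_c = (1 - \rho)^2 / (2\rho)$ with $\sigma_c = \sigma_a/\sigma_s$; the function names are ours.

```python
import math


def km_reflectance_thick(sigma_a, sigma_s):
    """Reflectance of a paint layer thick enough to hide the substrate."""
    c = sigma_a / sigma_s                    # combined coefficient sigma_c
    return 1.0 + c - math.sqrt(c * c + 2.0 * c)


def km_ratio_from_reflectance(rho):
    """Ratio sigma_a / sigma_s implied by a thick-layer reflectance rho."""
    return (1.0 - rho) ** 2 / (2.0 * rho)


if __name__ == "__main__":
    for ratio in (0.01, 0.1, 0.5, 2.0, 10.0):
        rho = km_reflectance_thick(ratio, 1.0)
        # The last column should reproduce the ratio we started from.
        print(ratio, rho, km_ratio_from_reflectance(rho))
```

Only the ratio of the two coefficients matters for a thick layer: a paint that absorbs little relative to its scattering is bright, and the reflectance falls toward zero as absorption dominates.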


The results of this theory are illustrated in Figure 15.31 (color plate). A real 
photograph is shown of a canvas painted with fourteen color swatches showing 
different amounts of combinations of two red paints with white. The spectral 
reflectances of the paints are shown in Figure 15.32, and the dependencies of the 
absorption and scattering coefficients are shown in Figure 15.33. Note how much 
more accurately the pigment model matches the real paints. 

Equation 15.77 represents the solution to the basic Kubelka-Munk differential 
equations as they were originally presented in 1931 [255]. Fishkin has described 
the evolution of these results through several years of improvements by a series 
of researchers [145]. We will not follow those developments in detail, but will 
summarize the main results. 

FIGURE 15.32

Spectra of the paints used in the real canvas. Redrawn from Haase and Meyer in ACM Transactions
on Graphics, fig. 9, p. 318.


FIGURE 15.33

Scattering coefficients as functions of wavelength of the paints used in the real canvas. Redrawn
from Haase and Meyer in ACM Transactions on Graphics, fig. 10, p. 319.


The Kubelka-Munk equations were generalized by Duncan to allow arbitrary
mixtures of pigments [132]. He assumed that if there are multiple materials with
different scattering and absorption coefficients, they may be combined by linear
weighting:

$$\sigma_s = \sum_{i=1}^{n} w_i\,\sigma_{s,i} \qquad \sigma_a = \sum_{i=1}^{n} w_i\,\sigma_{a,i} \tag{15.79}$$


Remember that these coefficients are both functions of wavelength. This hypothesis 
was not justified theoretically, but proposed and then confirmed by experiment. 
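Duncan's linear weighting is applied separately at each wavelength, after which the mixed coefficients feed into the reflectance formula as if they described a single paint. The following sketch works at one wavelength and assumes the thick-layer relation $\rho = 1 + \sigma_c - \sqrt{\sigma_c^2 + 2\sigma_c}$ with $\sigma_c = \sigma_a/\sigma_s$; the pigment coefficients and weights in the usage section are made-up sample numbers, not measurements.

```python
import math


def mix_coefficients(components):
    """Linearly weight pigment coefficients at a single wavelength.

    components is a list of (weight, sigma_a, sigma_s) tuples.
    Returns the mixed (sigma_a, sigma_s).
    """
    sigma_a = sum(weight * absorb for weight, absorb, _ in components)
    sigma_s = sum(weight * scatter for weight, _, scatter in components)
    return sigma_a, sigma_s


def thick_layer_reflectance(sigma_a, sigma_s):
    """Thick-layer Kubelka-Munk reflectance for the given coefficients."""
    c = sigma_a / sigma_s
    return 1.0 + c - math.sqrt(c * c + 2.0 * c)


if __name__ == "__main__":
    # Hypothetical coefficients: a strongly absorbing colorant and a white.
    colorant = (2.0, 1.0)    # (sigma_a, sigma_s)
    white = (0.1, 4.0)
    for w in (0.0, 0.25, 0.5, 0.75, 1.0):
        mixed = mix_coefficients([(w, *colorant), (1.0 - w, *white)])
        print(w, mixed, thick_layer_reflectance(*mixed))
```

Repeating the calculation at each sampled wavelength gives the reflectance spectrum of the mixture.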

Kubelka [232] later solved the original differential equations of Equation 15.61
for a finite thickness of paint. If the paint has thickness $h$, then

$$\rho_h = \frac{\bar\rho\,(\rho_0 - \rho) - \rho\,(\rho_0 - \bar\rho)\,e^{\sigma_s h(\bar\rho - \rho)}}{\rho_0 - \rho - (\rho_0 - \bar\rho)\,e^{\sigma_s h(\bar\rho - \rho)}} \tag{15.80}$$

where $\bar\rho = 1/\rho$, and as before $\rho$ is the reflectance of a layer so thick that any increase
in thickness doesn't change the reflectance.

A simpler form [145,256] of this equation can be found using the hyperbolic trig
functions sinh, cosh, and coth. Then

$$\rho_h = \frac{1 - \rho_0\,(a - b\coth b\sigma_s h)}{a - \rho_0 + b\coth b\sigma_s h} \tag{15.81}$$

where

$$a = 1 + \sigma_a/\sigma_s \qquad b = \sqrt{a^2 - 1} \tag{15.82}$$

as earlier. When the paint becomes thick enough to hide the substrate, $\rho_0 \to 0$, so
Equation 15.81 becomes

$$\rho_h = \frac{1}{a + b\coth b\sigma_s h} \tag{15.83}$$

and if the paint is infinitely thick, $h \to \infty$, so $\coth b\sigma_s h \to 1$, reducing Equation 15.83
to

$$\rho_h = \frac{1}{a + b} \tag{15.84}$$

which is exactly Equation 15.76, showing that the infinite-thickness solution is just
a special case of the more general finite-thickness solution.
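The finite-thickness solution can be checked against the differential equation it came from. The sketch below assumes the hyperbolic form $\rho_h = [1 - \rho_0(a - b\coth b\sigma_s h)] / [a - \rho_0 + b\coth b\sigma_s h]$ and compares it with a direct numerical integration of $dr/dh = \sigma_s(r^2 - 2ar + 1)$ starting from $r = \rho_0$ at $h = 0$. The Runge-Kutta integrator, the function names, and the sample values are our own choices; both coefficients are assumed to be positive.

```python
import math


def km_reflectance(rho0, sigma_a, sigma_s, h):
    """Closed-form reflectance of paint of thickness h over a substrate
    of reflectance rho0 (assumes sigma_a > 0 and sigma_s > 0)."""
    if h == 0.0:
        return rho0                      # no paint: only the substrate is seen
    a = 1.0 + sigma_a / sigma_s
    b = math.sqrt(a * a - 1.0)
    coth = 1.0 / math.tanh(b * sigma_s * h)
    return (1.0 - rho0 * (a - b * coth)) / (a - rho0 + b * coth)


def km_reflectance_numeric(rho0, sigma_a, sigma_s, h, steps=1000):
    """Integrate dr/dh = sigma_s (r^2 - 2 a r + 1) from r(0) = rho0 using
    classical fourth-order Runge-Kutta steps."""
    a = 1.0 + sigma_a / sigma_s

    def slope(r):
        return sigma_s * (r * r - 2.0 * a * r + 1.0)

    r = rho0
    dh = h / steps
    for _ in range(steps):
        k1 = slope(r)
        k2 = slope(r + 0.5 * dh * k1)
        k3 = slope(r + 0.5 * dh * k2)
        k4 = slope(r + dh * k3)
        r += dh * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return r


if __name__ == "__main__":
    rho0, sigma_a, sigma_s = 0.8, 0.5, 1.0        # sample values
    for h in (0.0, 0.1, 0.7, 3.0, 50.0):
        print(h, km_reflectance(rho0, sigma_a, sigma_s, h),
              km_reflectance_numeric(rho0, sigma_a, sigma_s, h))
```

As the thickness grows, both columns settle at $1/(a + b)$ regardless of the substrate, which is the sense in which a thick coat hides what lies beneath it.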

Fishkin points out a number of limitations to this theory of pigment modeling
[145]; we list a few of the most significant ones here.

1 The values of $\sigma_a$ and $\sigma_s$ are dependent on the combination of pigment and
medium; the same pigment in a different medium (say water and oil) will have
different coefficients of scattering and absorption.

2 The model completely ignores the chemical and electrical interactions between 
pigments, the medium in which they are suspended, and the substrate. Such 
interactions can substantially affect the chemical composition of the materials, 
and hence the resulting color. 

3 The model presented above assumes that the paint is homogeneous; in fact, 
the particles tend to flocculate (or clump), which again changes the color. 

4 The scattering assumptions in the transport theory were based on uniformly 
sized spherical particles. This is rarely the case for real pigment particles, 
which can take on cylindrical, bulletlike, or teardrop shapes of varying sizes. 

5 All of these models have ignored what happens to the light as it enters and 
exits the particle; this is an interface between two media like any other, which 
can involve reflection, transmission, and polarization effects. 

6 The models all assume that the substrate is planar and the paint is of uniform 
thickness; this is rarely the case in practice (we can use a texture map to 
modulate the paint thickness h to compensate for some of this effect). 

7 The model assumes that we are viewing the pigment from within a medium of 
the same index of refraction as the carrier, that is, while both the observer and 
the paint are immersed in a vat of oil or water. Clearly this is not the usual 
case on dry land, and some account must be made of reflection and refraction 
at the paint’s surface. 

Judd and Wyszecki [232] present a number of alternative solutions to the Kubelka-Munk
theory. In addition to Equation 15.81, the most important are probably

$$R_0 = \frac{1}{a + b\coth b\sigma_s h} \tag{15.85}$$

and

$$T_i = \frac{b}{a\sinh b\sigma_s h + b\cosh b\sigma_s h} \tag{15.86}$$

where

$R_0$ is the reflectance of a layer with an ideal black background, and
$T_i$ is the transmittance of a layer.
An interesting limiting case of these equations is when the scattering coefficient
$\sigma_s$ goes to 0. Then $a = (\sigma_s + \sigma_a)/\sigma_s \to \sigma_a/\sigma_s$, and $b = (a^2 - 1)^{1/2} \to a$. Substituting
$a = b = \sigma_a/\sigma_s$ into Equations 15.85 and 15.86 gives us

$$R_0 = \frac{\sigma_s}{\sigma_a\,(1 + \coth \sigma_a h)} \tag{15.87}$$

and

$$T_i = \frac{1}{\sinh \sigma_a h + \cosh \sigma_a h} = \frac{1}{\exp[\sigma_a h]} \tag{15.88}$$

These equations make the reasonable statements that as the scattering coefficient
$\sigma_s \to 0$, the reflectivity $R_0$ of a layer over a pure black reflector also goes to zero.
And the transmittance drops off as an equal percentage per unit layer of colorant.
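The limiting behavior can be confirmed numerically by evaluating the general layer formulas with a very small scattering coefficient. This sketch assumes the forms $R_0 = 1/(a + b\coth b\sigma_s h)$ and $T_i = b/(a\sinh b\sigma_s h + b\cosh b\sigma_s h)$ with $a = 1 + \sigma_a/\sigma_s$ and $b = \sqrt{a^2 - 1}$; the function names and sample values are ours, and both coefficients and the thickness are assumed to be positive.

```python
import math


def layer_reflectance_black(sigma_a, sigma_s, h):
    """Reflectance R_0 of a layer over an ideal black background."""
    a = 1.0 + sigma_a / sigma_s
    b = math.sqrt(a * a - 1.0)
    return 1.0 / (a + b / math.tanh(b * sigma_s * h))


def layer_transmittance(sigma_a, sigma_s, h):
    """Transmittance T_i of the layer."""
    a = 1.0 + sigma_a / sigma_s
    b = math.sqrt(a * a - 1.0)
    x = b * sigma_s * h
    return b / (a * math.sinh(x) + b * math.cosh(x))


if __name__ == "__main__":
    sigma_a, h = 1.0, 2.0                         # sample values
    for sigma_s in (1.0, 1e-3, 1e-6, 1e-9):
        print(sigma_s,
              layer_reflectance_black(sigma_a, sigma_s, h),
              layer_transmittance(sigma_a, sigma_s, h),
              math.exp(-sigma_a * h))             # the pure-absorption limit
```

As the scattering coefficient shrinks, the reflectance column heads to zero and the transmittance column approaches the exponential falloff in the last column.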


15.8.5 The Hanrahan-Krueger Multiple-Layer Model

The Hanrahan-Krueger model [190] represents a surface by a series of layers, of 
different materials, where each layer has a different set of descriptive coefficients, 
as shown in Figure 15.34. Like the ocean model of Equation 15.57 due to Nishita 
et al. [323], the Hanrahan-Krueger model explicitly evaluates the reflection and 
transmission of light at media boundaries, and the volume scattering of light within 
layers. 

According to Judd and Wyszecki [232], a layer of material is a thin sheet whose 
thickness is small compared to its length and width. As we saw for the atmosphere 
and pigment models above, we assume a layer is homogeneous in all ways. 

Multiple-layer models combine surface models with volume models in alterna¬ 
tion. A volume model accounts for the structure inside each material, and a surface 
model represents the interface between each adjacent pair of materials. Each ma¬ 
terial and interface may use a different model, or the same model with different 
parameters. 

The material descriptors include the index of refraction, absorption and scattering 
coefficients, depth (or thickness), and the phase function; they use the one-term 
Henyey-Greenstein phase function. 

The algorithm is based on a 1D transport model which is solved with a Monte
Carlo sampling scheme. Using the Fresnel formula to find how much light will pass
through the outermost surface of the coating, the model then evaluates the scattering
and absorption within each layer, including the reflection and transmission effects
at each internal boundary. Hanrahan and Krueger assumed that if a material is
a mixture of several materials, then the mixture is a uniform and homogeneous
combination whose coefficients are given by a sum of the component coefficients
weighted by percentage.

FIGURE 15.34

Several layers of material over a surface, and some light interacting with the materials. Redrawn
from Hanrahan and Krueger in Computer Graphics (Proc. Siggraph '93), fig. 1, p. 166.


The BRDF is then described by a combination of the reflection function on the 
outer surface and the internal subsurface scattering handled by the Monte Carlo 
evaluation. Figure 15.35 (color plate) shows an example of a head rendered by 
Lambert shading and by the subsurface model. The skin was assumed to be composed 
of two layers. The outermost skin layer had tissue and pigment particles containing 
melanin, which selectively absorbs light, producing a brown-to-black appearance, 
and scatters strongly in the forward direction. The inner blood-and-tissue layer was 
assumed to absorb green and blue, and to offer substantial isotropic scattering. The 
two columns on the left were rendered using Lambert shading, and the two middle 
columns using the subsurface model. The differences are shown on the right, with 
red indicating where the subsurface model gave off more light, and blue where it
gave less. Figure 15.36 (color plate) shows a head produced by using texture maps
to modulate the thickness and density of the layers.


15.9 Texture 

The term texture has been used in many ways in the image synthesis literature. In its 
most general sense, anything that is evaluated at a point using only information local 
to that point is a texture. Such local information can include the point’s location 
in space, its position on a surface, the directions and magnitudes of the partial 
derivatives of the surface at that point, a surface normal, the gradient of a scalar field 
evaluated at that point, and so on. The evaluated function often returns a scalar, 
but it can in theory be any data type, including a vector, a color, or any arbitrary 
structure. Usually textures are used as parameters to a shading function, but there 
are notable exceptions. Texture mapping , or the application of texture to geometry, 
represents an important link between geometry and shading. 

A volume texture can be evaluated at any point in space; a surface texture can
be evaluated only at points on a surface. Textures are typically either procedural
and evaluated on demand by some program, or stored in a table which is accessed to 
find the texture value. The correspondence between the texture table and a surface 
or volume is defined by a texture map. Textures were originally introduced by Blinn 
and Newell [51], and a great deal of work has been focused on texturing since then. 
Much of the work in texturing has been devoted to finding efficient and useful surface 
parameterizations, useful procedural functions, and efficient methods for sampling 
and filtering textures. Discussions of modern texture mapping may be found in 
Heckbert [205] and Watt and Watt [473]. A particularly popular class of procedural
textures is generated by the noise functions given in Perlin [338].

The first application of textures was to specify the color of a surface at every 
point [51]; physically, it was like applying a decal or sticker to a surface. 

Textures have been used in several ways to change (or appear to change) the 
geometry of the surface to which they are applied. The first example is bump 
mapping , introduced by Blinn [47], which perturbs the normal on a surface to create 
what appear to be small wrinkles or bumps on the surface. This trick breaks down 
near the silhouette of the object (because the silhouette is unchanged, the bumps 
implied by the shading are not visible in the geometry), and at near-glancing angles 
to the surface (because there is no blocking or geometric attenuation due to the 
bumps). In general, though, as long as the bumps are very small and the object is 
some distance away, bump mapping is an effective way to imply small deformations 
to a shape without actually changing the geometry. 

The hypertexture method due to Perlin and Hoffert [339] allows us to actually 
change the surfaces of objects. Hypertexture is a volumetric modeling technique that
implies surfaces even where explicit surfaces have not been created; it is rendered
using volumetric methods. Some examples of hypertexture are shown in Figure 15.37 
(color plate). 

The texel-mapping method of Kajiya and Kay [237] maps not just a scalar or a 
coordinate system but an entire surface description onto the surface of an object. 
This bundle of information, called a texel, actually carries a complete shading model 
that may be further parameterized by other values mapped onto the surface. The 
furry bear of Figure 15.38 (color plate) was produced with this method. 

The displacement texture method due to Cook provides another way to actually 
alter the geometry of the surface [100]. Rather than simply perturbing the normal 
at a surface to simulate a wrinkle or bump, displacement mapping actually moves 
the surface by a given amount in a given direction. Rendering displacement-mapped 
surfaces can present a challenge to some systems, particularly when the displacements 
become large, but the basic idea is straightforward. The results are often much better 
than with bump mapping, because displacement-mapped objects actually exhibit
self-hiding, self-shadowing, and a changed silhouette. Some examples of displacement
mapping along with color mapping are shown in Figures 15.39 and 15.40 (color 
plates). 
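
The contrast between bump mapping and displacement mapping can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the base surface is the flat patch z = 0 with normal (0, 0, 1), and the height field `height` is a made-up ripple function, not taken from any of the cited papers. Both routines produce the same tilted normal; only the displacement version moves the shaded point.

```python
import math

def height(u, v):
    """Hypothetical height field: small ripples over the (u, v) parameter plane."""
    return 0.05 * math.sin(8.0 * u) * math.cos(8.0 * v)

def height_gradient(u, v, eps=1e-5):
    """Central-difference estimate of (dh/du, dh/dv)."""
    dh_du = (height(u + eps, v) - height(u - eps, v)) / (2.0 * eps)
    dh_dv = (height(u, v + eps) - height(u, v - eps)) / (2.0 * eps)
    return dh_du, dh_dv

def normalize(x, y, z):
    length = math.sqrt(x * x + y * y + z * z)
    return (x / length, y / length, z / length)

def bump_map(u, v):
    """Bump mapping: the point stays on the plane z = 0; only the normal tilts."""
    dh_du, dh_dv = height_gradient(u, v)
    return (u, v, 0.0), normalize(-dh_du, -dh_dv, 1.0)

def displacement_map(u, v):
    """Displacement mapping: the point moves along the base normal (0, 0, 1)."""
    dh_du, dh_dv = height_gradient(u, v)
    return (u, v, height(u, v)), normalize(-dh_du, -dh_dv, 1.0)

bump_point, bump_normal = bump_map(0.3, 0.2)
disp_point, disp_normal = displacement_map(0.3, 0.2)
```

Because the bump-mapped point never leaves the base plane, the silhouette and self-shadowing are those of the flat patch, which is exactly the failure described above; the displaced point carries the new geometry with it.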


15.10 Hierarchies of Scale

When we are very far from a surface, we generally need only a coarse representation 
of its reflectivity and geometric characteristics. As we approach the surface, we begin 
to see finer detail in both the geometry and the way it scatters light. 

The level-of-detail problem for geometry was recognized early in computer
graphics [110]; to achieve real-time performance, only the minimum number of polygons
needed to represent a shape could be displayed. The basic idea is that we ought 
not to waste time processing detail that is sufficiently small that we could ignore it 
without introducing significant error. In terms of shading models, we should use the 
crudest representation of a shading function that will meet our visual or simulation 
criteria. 

Kajiya suggested [233] that the modeling and rendering level-of-detail problems 
were closely coupled. He suggested that there is a hierarchy of detail in geometric 
models, where increasing levels of detail correspond to the geometry of the model, 
then texture-mapping, and then shading. Furthermore, these levels overlap, as shown 
in Figure 15.41. Although we will speak in terms of surfaces in this section, this 
observation and our discussion are equally applicable to volumes. 

FIGURE 15.41

A hierarchy of detail. Redrawn from Kajiya in Computer Graphics (Proc. Siggraph ’85), fig. 2,
p. 18.

The issues raised by Kajiya’s hierarchies of scale have to do with the size of an
object that is sampled by a rendering algorithm. If the object is observed directly
from the viewpoint for an image, then this sample region is directly related to its
projection on the screen, and this projection may be used to estimate how much of
the surface is being sampled. But within a complex 3D environment, some surfaces 
that may not be directly visible to the viewer might still be densely sampled; for 
example, a painting on the wall behind your head could still be visible in the curved 
wall of a shiny vase. To complicate the issue still further, the same object may be 
densely sampled from some points in the environment and sparsely sampled from 
others, even within the same image, or during the rendering of a single pixel! 

Even in systems where objects are subdivided and not point-sampled, the correct 
level of subdivision may vary according to different needs (and viewpoints) during 
rendering. Figure 15.42 shows two pixels, each sampled with four samples. A small 
checkerboard is on the right side of the scene, and a mirror is on the left. In
Figure 15.42(a) we see that when we look directly at the checkerboard, our four samples
land roughly in the four quadrants of the checkerboard, so the region of integration 
for each sample is roughly a square quadrant, as shown in Figure 15.42(b). 

In Figure 15.42(c) we see the world through a different pixel; the samples bounce 
off of the mirror and land on the checkerboard. The sampling pattern on the 
checkerboard is quite different, as shown in Figure 15.42(d). Only three samples hit
the board, and the regions over which we integrate the flux from the board are quite 
different. 

FIGURE 15.42

A scene of a mirror on the left and a checkerboard on the right. (a) A sampling pattern through a
pixel. (b) The integration regions induced by (a). (c) A sampling pattern through a different pixel.
(d) The integration regions induced by (c).

Poulin and Fournier [344] discussed the problem in terms of a hierarchy of
geometries, as shown in Figure 15.43. At the highest level is the geometric model itself,
and below that is a bump-mapped or displacement-mapped version of the geometry;
note that only displacement mapping actually moves the underlying geometry. Then
comes a texture that parameterizes the underlying fine-scale structure of the surface,
which may be based on geometry itself (in their case it was a model of parallel
cylinders). Finally comes the BRDF, which can also be represented geometrically as
a collection of microfacets.

FIGURE 15.43

A hierarchy of geometric models. Reprinted, by permission, from Poulin and Fournier in Computer
Graphics (Proc. Siggraph ’90), fig. 1, p. 275.
[Figure labels: Geometry; Bump map; Displacement map; Anisotropic distribution; Anisotropic surface model; Isotropic microfacets.]

Westin et al. [475] described the problem in terms of orders of magnitude of scale,
as shown in Figure 15.44 (color plate). The largest scale includes phenomena on the
size of 1 meter. They call this the object scale, and it contains the basic geometry of
objects in the scene: polygons, patches, and volumetric functions. Two or three
orders of magnitude below that come 1-millimeter-sized milliscale phenomena, which
are handled by textures of various sorts, including bump maps and texels. Finally,
two or three orders of magnitude smaller still, one comes to the microscale, which is the
domain of particle-sized interactions handled by the BRDF.

From this point of view, the size of the sampled region indicates the scale of 
the phenomena being sampled, and suggests the precision needed to evaluate the 
sample. If we take a single sample from a house from a distance of 1 kilometer, there 
is probably no need to carefully integrate the burrowing of the light into the sublayers 
of paint on the side; if the house is mostly red, then we can just return red and be 
done with it. But if that sample is near other samples that are only millimeters away, 
then more careful shading models are called for. Westin et al. presented a model that 
can move through these scales when the BRDF is used to model the surface geometry
[475].

The best way to smoothly move between these scales is still an open problem. 
One method proposed by Becker and Max [35] is to redistribute bump maps so that 
they have the same overall energy output as a displacement-mapped surface or its 
underlying BRDF. The Becker-Max algorithm can then use any of the three methods 
to compute shading based on the size of the interrogated patch, and the overall 
energy coming from any piece of the surface will match that coming from the rest. 
Figure 15.45 (color plate) shows a bumpy teapot in extreme close-up, rendered with 
this method. 

The full answer to the hierarchies of scale problem for the reflectance function is 
not yet in. There seem to be two trends, one that decouples shading from visibility 
(used by the displacement-mapping method) and another that ties them together 
intimately (such as the Becker-Max transition method). Some recent developments 
in the simplification of geometric models [214,443] offer hope that we can handle the 
geometry problem in a semiautomatic way, replacing complex models with simpler 
approximations when appropriate, reducing memory consumption and execution 
times.


15.11 Color

In Unit I of this book, I discussed color and its perception by the human visual 
system. We saw that even though the perceptual color space is three-dimensional, 
an accurate representation of color is not possible when every surface and volume 
is represented simply by three single-wavelength samples. Although many rendering 
systems today continue to render a color picture by essentially computing separate 
red, green, and blue images and then displaying them simultaneously, this model can 
produce images with severe artifacts even in everyday scenes [181]. 

One alternative to the RGB model is to subdivide the visible band into many 
smaller pieces, render an image for each band, combine the results, and then
transform the combined data into a single picture suitable for display (perhaps on a CRT
using an RGB color space). Essentially, this is a supersampling approach in the 
color space; we recognize that sampling only three fixed wavelengths leads to color 
aliasing, so we sample more densely to capture more high-frequency information. 
Once the visible band has been stratified, samples may be taken in the center of each 
band [181], or jittered within the band to trade uniform sampling artifacts for noise 
[154]. 
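
The two placement strategies just described can be sketched as follows. This is a minimal illustration, not code from the cited papers: the function names are hypothetical, and the 380 to 780 nm limits of the visible band are an assumption chosen for the example.

```python
import random

def band_edges(lo_nm, hi_nm, n_bands):
    """Split [lo_nm, hi_nm] into n_bands equal-width strata."""
    width = (hi_nm - lo_nm) / n_bands
    return [(lo_nm + i * width, lo_nm + (i + 1) * width) for i in range(n_bands)]

def center_samples(lo_nm, hi_nm, n_bands):
    """One sample at the center of each band (uniform sampling)."""
    return [0.5 * (a + b) for a, b in band_edges(lo_nm, hi_nm, n_bands)]

def jittered_samples(lo_nm, hi_nm, n_bands, rng):
    """One sample placed at random within each band (stratified jitter)."""
    return [rng.uniform(a, b) for a, b in band_edges(lo_nm, hi_nm, n_bands)]

# Assumed visible band of 380-780 nm cut into 10 nm strata gives forty samples.
centers = center_samples(380.0, 780.0, 40)
jittered = jittered_samples(380.0, 780.0, 40, random.Random(1))
```

Either list of wavelengths would then drive one rendering pass per sample; the jittered version trades the regular aliasing pattern of the centered samples for noise.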

Meyer has observed [300] that this is a very expensive solution: if the visible 
band is subdivided into relatively coarse 10-nanometer intervals, that still leaves 
about forty samples that must be evaluated; in effect, forty separate pictures that 
must be computed. Since many real spectra are rather smooth, Meyer reasoned that 
perhaps they can be matched by a small number of basis functions. Then we need 
only track the coefficients on the basis functions rather than the many individual 
samples across the spectrum. 

Meyer describes such an approach in [300]. He first derives a color space called
the $AC_1C_2$ space, where the axes pass through the densest regions of the color
space defined by the CIE XYZ tristimulus curves. This transformation is given by

$$
\begin{bmatrix} A \\ C_1 \\ C_2 \end{bmatrix}
=
\begin{bmatrix}
-0.0177 & 1.0090 & 0.0073 \\
-1.5370 & 1.0821 & 0.3209 \\
0.1946 & -0.2045 & 0.5264
\end{bmatrix}
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
\tag{15.89}
$$


A reasonable question to ask now is, if we are committed to sampling the
environment with single spectral samples, how many samples should there be and where
should they be placed to get the best trade-off of effort to accuracy in evaluating $A$,
$C_1$, and $C_2$? Meyer constructed a number of Gaussian quadrature rules using
different numbers of points to evaluate the three coefficients. He found that a four-point
rule gave good accuracy when evaluating the colors on the Macbeth ColorChecker
chart, a standard set of color references [291] (the data for this chart is given in
Appendix G). The quadrature rule recommended by Meyer is given in Table 15.4.
Note that none of the color parameters uses all four of the spectral samples.

λ (nm)    456.4      490.9      557.7      631.4

A                    0.18892    0.67493    0.19253
C_1                  0.31824               -0.46008
C_2       0.54640

TABLE 15.4

Evaluating $A$, $C_1$, $C_2$ from four spectral samples at four values of $\lambda$. Source: Data from Meyer,
Computer Vision, Graphics, and Image Processing, table 1a, p. 70.


So we need only sample at four spectral locations in order to evaluate $A$, $C_1$,
and $C_2$, and from them find $X$, $Y$, and $Z$ using the inverse transformation of
Equation 15.89:

$$
\begin{bmatrix} X \\ Y \\ Z \end{bmatrix}
=
\begin{bmatrix}
0.7311 & -0.6130 & 0.3636 \\
1.0030 & -0.0124 & -0.0064 \\
0.1194 & 0.2218 & 1.7628
\end{bmatrix}
\begin{bmatrix} A \\ C_1 \\ C_2 \end{bmatrix}
$$

A variation on this idea is described by Raso and Fournier [354]. They divide
the visible band into two pieces, split at 555 nm, the peak of the photopic
(daylight-vision) sensitivity curve. Each subband is represented by a cubic polynomial, so
a color is encoded by eight floating-point numbers. They note that when colors 
are filtered, the order of the polynomial increases, but they then find the best cubic 
fit using Tchebyshev polynomials and retain that. In essence they are weighting 
and combining the first four monomials, which serve as basis functions for cubic 
polynomials. 

Another basis-function approach has been described by Peercy [337]. He
reasoned that the colors we are most interested in representing in any particular image
are those that are combinations of the light source spectra and the surface spectra 
of the lights and objects in that image (and higher-order combinations of these as 
well). Peercy started with the spiky spectrum of a fluorescent light, illustrated in 
Figure 15.46, and combined that with four of the colors on the Macbeth color chart. 
He then used characteristic vector analysis to extract the principal vectors that best 
described the spectra produced by combinations of this illuminant and these surface 
colors, and used the most significant ones as the basis functions for rendering a test 
scene. 

FIGURE 15.46

The spectrum of a fluorescent light. Redrawn from Peercy in Computer Graphics (Proc. Siggraph
’93), fig. 1, p. 194.

Figure 15.47 (color plate) shows the results of using two, three, four, and then
five of the most significant basis functions to render the scene, and also using the
square of those counts to determine the number of equally spaced point samples in
the visible band. In the left column, the two-basis function image is somewhat dark
and lacking in green, but the three-basis function image seems indistinguishable from
the five-basis image.

But in the right column, the results are not nearly as good; the four-sample image 
is badly distorted, the nine-sample image has a yellow cast, and the reflections in 
the cube have a much reduced contrast. The sixteen-sample image still has a bit too 
much yellow. Finally, the twenty-five-sample image looks very close to the five-basis 
image. 

These images are only test cases and do not tell us the quantitative error, but we 
can see that for this scene, equal-interval spectral sampling requires at least sixteen 
samples to do an even approximate job of estimating the color (if the sample locations 
were not evenly spaced, but instead had been analyzed and optimized for the best 
locations, the images might have been better). 

Point-sampling methods retain the advantage of being able to adaptively sample 
regions of high-frequency content, and easily support adaptive supersampling, but 
they are prone to aliasing and other undersampling artifacts. The choice of function 
used for reconstruction from point samples is also important; we often reconstruct 
using boxes that extend to the midpoints of adjacent samples, but a smoother filter 
such as those discussed in Unit II would probably yield better results. 

Basis-function methods give continuous results that converge quickly, but they 
require that we select good basis functions beforehand, which can be difficult.


15.12 Further Reading

General discussions of surfaces and materials may be found in the books by Judd and 
Wyszecki [232], Wyszecki and Stiles [489], and Siegel and Howell [406]. RenderMan
is a proposed standard which includes a shading language; Upstill details the language 
and offers many examples in his book [446]. Discussions of shading models for 
computer graphics are offered in books by Hall [181], Sillion and Puech [409], and 
Watt and Watt [473]. 

Good summaries of what goes on inside materials may be found in Turner’s book 
on paint [444] and Leverenz’s book on general luminescence [267]. 

Physically based shading models are often based on principles of optics. Some of 
the many good optics books are Moller [311], Born and Wolf [55], and Jenkins and 
White [230]. The thermal radiation literature surveyed in that field’s classic text by 
Siegel and Howell [406] also contains a wealth of material. 

Discussions of scattering in volumes are given in the books by Bohren and
Huffman [53], Denman et al. [122], and McCartney [292].

An interactive program for exploring the HTSG model has been written by He et 
al. [197] and is available on CD-ROM in the Siggraph ’92 proceedings. 

Finding efficient representations for Earth’s atmosphere and atmospheric effects 
has been a topic of study since the first shading models were developed. Some 
references to this work include the papers by Blinn [49], Inakage [225], Kajiya 
[236], Klassen [247], Max [286,287], Nishita et al. [320,323], Tadamura [430], 
and Zibordi [506]. The books by Lenoble [266] and McCartney [292] provide lots 
of material for this topic. 

A good introduction to how light behaves inside many different varieties of 
crystals may be found in Wood’s book [488]. 

Three very interesting and enjoyable books deserve special mention. Greenler 
[170] presents the results of numerous ray-tracing experiments designed to simulate 
complex atmospheric phenomena. Minnaert [305] addresses many fascinating topics 
on the nature of light in the atmosphere, including a method for observing “with 
our naked eye, unaided by any instrument, that the light from the sky is polarized!” 
Meinel and Meinel [296] discuss both rare and common phenomena like twilight 
colors, the “green flash,” the effects of volcanic dust on sunsets and sunrises, and 
auroras. These books all reward a few evenings’ investment with a lifetime of 
increased awareness and pleasure in the sky above us. 


15.13 Exercises 

Exercise 15.1

Describe the difference in appearance between objects shaded with the Phong and 
Blinn-Phong models.

Exercise 15.2

Review the paper by Lewis [273] on adding energy conservation to Phong and 
Torrance-Sparrow shaders. Do you think that the addition of energy conservation 
improves these models, or do their assumptions overwhelm the corrections? 

Exercise 15.3

Review the empirical shading model proposed by Schlick [381] and compare it 
against the other empirical models in this chapter. Which would you select for a 
production rendering system? Why? 

Exercise 15.4

Use Equation 15.9 to show that Rp → 1 as 6 → 0.

Exercise 15.5

Use Equation 15.76 and a c = a a /(r s to derive Equations 15.77 and 15.78. 

Exercise 15.6

Nishita et al. actually state Equation 15.40 in terms of the angle a rather than cos a: 

P/ l (a) = 1 4-9cos 16 (a/2) 

P/ l (a) = 14- 50 cos 32 (a/2) 

Show that these equations are the same as Equation 15.40. 



Inversions solve problems. 

Gary Marks 

(“The Gary Marks Piano Method, Book II,” 1992) 



INTEGRAL EQUATIONS 


16.1 Introduction 

At the end of Chapter 12 we arrived at Equation 12.98, which presented the integral 
form of the transport equation. This equation describes the flux (or energy) flowing 
in every direction, everywhere in space. Although the flux $\Phi(\mathbf{r}, \vec{\omega})$ appears on the
left-hand side of the equation, it also appears on the right. Thus, rather than providing
an explicit representation for the flux, this equation gives an implicit condition that 
the flux must satisfy. 

As we have seen in Chapter 13, radiance is a more useful characterization of 
the light energy in a scene than the raw flux from which the radiance is derived. 
Our first goal in Chapter 17 will be to rewrite the transport equation in terms of 
radiance, giving us the radiance equation. The radiance equation is the keystone of 
image synthesis based on geometrical optics. Our primary job in image synthesis 
is to find a useful approximation to the radiance function defined by this equation; 
that function tells us the precise color of every point in an image. 

The radiance equation will have the same form as the transport equation; that 
is, it will specify a function in terms of an integral that contains that function. This 
type of equation is called an integral equation. There are many methods for solving
integral equations, and we can look over the history of image synthesis algorithms
and categorize nearly all of them as computing different approximations to this 
equation, even long before it was explicitly realized in graphics that such a unifying 
equation existed! The most successful (and popular) image synthesis algorithms are 
so close to the standard integral equation methods that it seems remarkable that they 
were independently developed. 

We will discuss various rendering algorithms in Unit IV in terms
of how they relate to standard procedures for solving integral equations in
general. Therefore we first need to review the theory of integral equations and discuss
techniques for their solution. 

The central goal of this chapter will be to introduce the relevant theory of
integral equations and survey some of the more popular techniques for approximating
solutions. Because our goal is not functional or numerical analysis per se, but image 
synthesis, we will be rather informal when discussing solution algorithms, and we 
will generally not cover issues of convergence, the existence of inverses, guarantees 
of continuity and stability, and other important mathematical issues. These topics 
are discussed with precision in the literature, and we leave the reader who desires 
such rigor to consult the references for all the details. 

Choice of notation is always a difficult issue when alternatives abound, as they 
do in the integral equations literature. Much of the literature writes the unknown 
function as $f(s)$, a real-valued function of the real parameter $s$. In keeping with our
notation of Unit II, we write this instead as $x(t)$. Most of the rest of our notation is
similar to that of Kanwal [240]. We will discuss integral equations in this chapter
in their 1D, one-parameter form for an unknown function $x(t)$. Many of these
methods will generalize easily to the multidimensional, multiparametric equations 
used in image synthesis. 

16.2 Types of Integral Equations

Any equation that describes some function in terms of one or more integrals of that 
function may be called an integral equation . In practice, just a few different general 
structures of integral equations seem to capture most mathematical models of natural 
phenomena, including the radiance equation. 

To help set the stage, consider the following integral equation:

$$
x(t) = g(t) + \lambda \int_a^b k(t,u)\,x(u)\,du \tag{16.1}
$$

Equation 16.1 has the general form we will be most interested in for image synthesis.
In this equation, we are given everything but the unknown function $x(t)$, which we
want to find.

The function $x(t)$ is a real-valued function of the independent real variable $t$. The
real function $g(t)$ is called the free term or the driving function. The value $\lambda$ is in
general a complex number. The integral involves a real function $k(t,u)$ of two real
numbers; this function is called the kernel of the integration. In this chapter, we will
always use $x(t)$ for our unknown, and letters such as $u$ and $v$ for dummy variables
of integration, which are always considered real numbers.

We can now describe the different classes of integral equations based on what 
characteristics of this general form they share [120,251]. In general, an integral 
equation is described by a sequence of adjectives, defined below in a
question-and-answer format.

Name: What is the upper bound of the domain of the integral? 

Fredholm: Some real number $b$.

Volterra: The evaluation point t. 

Kind: Where does the unknown function appear? 

First kind: Inside the integration only. 

Second kind: Both inside and outside the integral. 

Third kind: Both inside and outside the integral, and weighted on the left-hand
side by a function $\mu(t)$, which is zero for at least one $t$ in the domain of
integration.

Singularity: Is the integral proper?

Singular: The integral is singular (or improper) if the domain is infinite, the
integrand is unbounded somewhere in the domain, the kernel is
discontinuous, or a combination of some or all of these.

Nonsingular: The integral is not improper. 

Homogeneity: Is the driving term zero? 

Homogeneous: g(t) = 0 throughout the domain. 

Inhomogeneous: $g(t) \neq 0$ somewhere in the domain.

Linearity: Is the unknown function a linear term in the integral? 

Linear: The integral is linear in x(t). 

Nonlinear: The integral is not linear in x(t).

              Fredholm                                                      Volterra

First kind    $g(t) = \lambda \int_a^b k(t,u)\,x(u)\,du$                    $g(t) = \lambda \int_a^t k(t,u)\,x(u)\,du$
Second kind   $x(t) = g(t) + \lambda \int_a^b k(t,u)\,x(u)\,du$             $x(t) = g(t) + \lambda \int_a^t k(t,u)\,x(u)\,du$
Third kind    $\mu(t)\,x(t) = g(t) + \lambda \int_a^b k(t,u)\,x(u)\,du$     $\mu(t)\,x(t) = g(t) + \lambda \int_a^t k(t,u)\,x(u)\,du$

TABLE 16.1

Classification of integral equations.


Examples of some of these classes are shown in Table 16.1. A rich taxonomy of 
integral equations and their various relationships may be found in Golberg [162]. 

When an equation is homogeneous (the free term is zero), it may have only the
trivial solution $x(t) = 0$. However, for some values of $\lambda$ there may be nontrivial
solutions. Such values of $\lambda$ are called characteristic values for the equation, and the
corresponding functions are called the characteristic functions [93].

In this book, we will focus exclusively on the category occupied by the radiance
equation, which is a linear, inhomogeneous, Fredholm integral equation of the second kind.
Happily, this is a common form of integral equation, and its study occupies much of 
the literature. We will refer to this class of integral equations with the notation 
e.g., Equation 16.1 is a member of the class 

Singularities arise often in the radiance equation. Unfortunately, the effective 
treatment of singularities often involves knowing something about the kernel; this is 
expensive information in computer graphics, since the kernel describes not only the 
light arriving at a point from every surface in the scene, but the visibility of the entire 
scene from that point. So for most of this chapter, we will focus on nonsingular 
integral equations. A kernel with a finite number of simple discontinuities can often 
be replaced by a finite number of continuous kernels; true singularities (such as sharp 
shadow edges) may require more subtlety. We will discuss methods for dealing with 
singularities in Section 16.10. 

We will be rather informal in this chapter regarding the details of our function 
spaces. In general, we will make the overly strong assumption that all of our 
functions live in a Hilbert space (a linear space that is complete, that is, every
converging sequence of elements converges to something in the space), has a
real-valued norm $\|x\|$, and has a norm-derived inner product $\langle x \mid g \rangle$. Appendix A provides
some background for these terms.

FIGURE 16.1

(a) $x(t) = t^2$. (b) The new function $(\mathcal{T}x)(t) = \sin(t^2)$.


16.3 Operators 

When equations get typographically complex, they can be difficult to understand 
even if the concepts are straightforward. To keep the notation simple, we will use 
operator notation in this chapter. We saw operators in Unit II, but we didn’t do very 
much with them. Our use here will be much heavier and more varied, so we will 
review the notation here. 

An operator is similar to the idea of a functional [251]. Just as a function such as
$\sin(t)$ takes a real number and returns a real number, a functional takes a function
and returns a new function. For example, a simple functional $\mathcal{T}$ may be defined
$(\mathcal{T}x)(t) = \sin(x(t))$, so for $x(t) = t^2$, $(\mathcal{T}x)(t) = \sin(t^2)$, as shown in Figure 16.1.

Note that $(\mathcal{T}x)$ is a new function of the parameter $t$ and not simply a composite
of two function calls. To see this, note that $x(t) = t^2$ is structurally a quadratic,
so we can do things like differentiate it and find its global minimum, but the new
function $(\mathcal{T}x)(t)$ has an infinite number of local minima and no unique global
minimum. Another pair of example operators are $\mathcal{D}$, which takes
the derivative of a function, and $\mathcal{S}$, the scaling integral operator that integrates a
function multiplied by $\cos(x)$ over $[a, b]$. For example, for the function $f(x) = \sin(x)$,
differentiation gives $(\mathcal{D}f)(x) = (d/dx)\sin x = \cos(x)$, and scaled integration gives
$(\mathcal{S}f)(x) = \int_a^b \sin(x)\cos(x)\,dx = \left[\sin^2(x)/2\right]_a^b$.
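
The three example operators translate naturally into higher-order functions. The sketch below is an illustration only; the names `T`, `D`, and `S` mirror the text, while the numerical choices (a central difference for the derivative, a midpoint rule for the integral) are assumptions made to keep the example short.

```python
import math

def T(x):
    """The functional from the text: (Tx)(t) = sin(x(t)). Returns a new function."""
    return lambda t: math.sin(x(t))

def D(f, eps=1e-6):
    """Derivative operator, approximated here by a central difference."""
    return lambda t: (f(t + eps) - f(t - eps)) / (2.0 * eps)

def S(f, a, b, n=2000):
    """Scaling integral: integrate f(u) * cos(u) over [a, b] with the midpoint rule."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]
    return h * sum(f(u) * math.cos(u) for u in nodes)

def square(t):
    return t * t

Tx = T(square)                   # the new function t -> sin(t^2)
dsin = D(math.sin)               # approximately cos
s_value = S(math.sin, 0.0, 1.0)  # approximately sin(1)^2 / 2
```

Note that `T` and `D` return functions, which is the point of the parentheses in $(\mathcal{T}x)(t)$, whereas `S` over a fixed interval returns a single number.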

To represent the effect of a functional on a function, we use an operator. An
operator is written to the left of the function it modifies; the resulting new function
uses the original argument. So as above, the operator $\mathcal{T}$ applied to a function $x(t)$ is
written $(\mathcal{T}x)(t)$, where the parentheses around $(\mathcal{T}x)$ are intended to remind us that
this is a new function.

There exists a notation for defining operators, but it’s easier to present them by
demonstrating their effect on a generic function $x(t)$. We will initially be concerned
with just two operators, the identity and kernel integral operators.

As its name suggests, the identity operator $\mathcal{I}$ does nothing to its argument:

$$
(\mathcal{I}x)(t) = x(t) \tag{16.2}
$$

The kernel integral operator $\mathcal{K}$ takes a real-valued function $x(t)$ and integrates it
over a domain $[a, b]$, as scaled by a real-valued function $k(u, v)$ of two real parameters
according to

$$
(\mathcal{K}x)(t) = \int_a^b k(t,u)\,x(u)\,du \tag{16.3}
$$

The function $k$ is called the kernel of the integration. Note that $\mathcal{K}$ is linear:

$$
\begin{aligned}
\mathcal{K}(f + g) &= \mathcal{K}f + \mathcal{K}g \\
\mathcal{K}(af) &= a\,\mathcal{K}f
\end{aligned}
\tag{16.4}
$$

This should be no surprise, since we know that integration is linear and $\mathcal{K}$ is simply a
notational tool for representing a particular type of integration. Equation 16.4 shows
an additional bit of simplification common in operator expressions: the dependent
variable is often suppressed. So $\mathcal{K}(f + g)$ is understood to represent the function
$(\mathcal{K}(f + g))(t)$, and $\mathcal{K}x$ stands for $(\mathcal{K}x)(t)$.
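
A discretized version of the kernel integral operator makes the linearity easy to check numerically. This is a minimal sketch under stated assumptions: the integral is replaced by a midpoint rule, the helper name `kernel_operator` is invented, and the separable kernel $k(t,u) = tu$ on $[0, 1]$ is chosen only because its integrals are easy to do by hand.

```python
def kernel_operator(k, a, b, n=400):
    """Build K so that K(x) approximates the function t -> integral_a^b k(t,u) x(u) du."""
    h = (b - a) / n
    nodes = [a + (i + 0.5) * h for i in range(n)]

    def K(x):
        return lambda t: h * sum(k(t, u) * x(u) for u in nodes)

    return K

# Example kernel k(t, u) = t * u on [0, 1] (an assumption for illustration).
K = kernel_operator(lambda t, u: t * u, 0.0, 1.0)

def f(u):
    return u

def g(u):
    return 1.0

Kf = K(f)                               # exact answer: t / 3
Kg = K(g)                               # exact answer: t / 2
K_sum = K(lambda u: f(u) + g(u))        # should match Kf + Kg
K_scaled = K(lambda u: 3.0 * f(u))      # should match 3 * Kf
```

The sum and scaling checks hold up to floating-point rounding, because the quadrature is itself a linear combination of the samples of $x$.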

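We can use this notation to rewrite Equation 16.1 in a more compact form:

$$
\begin{aligned}
x(t) &= g(t) + \lambda \int_a^b k(t,u)\,x(u)\,du \\
x &= g + \lambda\mathcal{K}x
\end{aligned}
\tag{16.5}
$$

Recalling the identity operator $\mathcal{I}$, we can revise this equation as

$$
\begin{aligned}
x - \lambda\mathcal{K}x &= g \\
(\mathcal{I} - \lambda\mathcal{K})x &= g
\end{aligned}
\tag{16.6}
$$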

Equations 16.5 and 16.6 are the operator forms for equations of type

We will sometimes find it useful to think of the operator $\mathcal{K}$ as computing the inner
product of a constant-t “slice” of the kernel k(t, u) with the function x(u). Writing
$k_t(u)$ for the kernel at a particular value of t, we then have

$$ (\mathcal{K}x)(t) = \int_a^b k(t,u)\,x(u)\,du = \langle k_t \mid x \rangle \tag{16.7} $$

using our definition of the braket from Section 4.3.9. If both k and x are real, as we
will suppose throughout this chapter, then the braket is bilinear and symmetric. We
will not bother to explicitly conjugate the first term since we know it to be real, and
$\langle \bar{a} \mid b \rangle = \langle a \mid b \rangle$ for $a \in \mathbb{R}$. So the symmetry condition may be stated

$$ \mathcal{K}x = \langle k_t \mid x \rangle = \langle x \mid k_t \rangle \tag{16.8} $$

Any operator may have an inverse. For example, the operator defined by $\mathcal{F}f = f + 2$
has the inverse $\mathcal{F}^{-1}f = f - 2$. We will suppose in this book that all inverses
work both before and after the operator: $\mathcal{F}\mathcal{F}^{-1} = \mathcal{F}^{-1}\mathcal{F} = \mathcal{I}$. Using an inverse, we
can “solve” Equation 16.6 for x:

$$ x = (\mathcal{I} - \lambda\mathcal{K})^{-1} g \tag{16.9} $$

assuming that such an inverse exists. Right now Equation 16.9 is just a notational
device; we don’t yet know how to actually find an x to satisfy it.

Because the central equation of study in this chapter is given by $(\mathcal{I} - \lambda\mathcal{K})x = g$,
we will find it notationally convenient to represent the composite operator on x by
the single operator $\mathcal{L}$:

$$ \mathcal{L} = \mathcal{I} - \lambda\mathcal{K} \tag{16.10} $$

This allows us to write our integral equation in a particularly succinct form:

$$ \mathcal{L}x = g \tag{16.11} $$

so that $x = \mathcal{L}^{-1}g$. Our goal will be to find methods for evaluating (or estimating) a
function x satisfying this equation.

A pair of operators $\mathcal{A}$ and $\mathcal{A}^*$ are said to be adjoint if and only if they satisfy the
relationship

$$ \langle \mathcal{A}f \mid g \rangle = \langle f \mid \mathcal{A}^* g \rangle \tag{16.12} $$

for all f and g. For the integral operator $\mathcal{K}$, this is equivalent to saying

$$
\begin{aligned}
(\mathcal{K}x)(t) &= \int_a^b k(t,u)\,x(u)\,du \\
(\mathcal{K}^*x)(t) &= \int_a^b k(u,t)\,x(u)\,du
\end{aligned}
\tag{16.13}
$$

The kernel k(u, t) is called the adjoint kernel, and the second line above is called the
adjoint equation for the operator $\mathcal{K}$. If the kernel is symmetric, then k(u, t) = k(t, u)
for all u and t; because such a kernel is its own adjoint, we say its related operator
$\mathcal{K}$ is self-adjoint: $\mathcal{K} = \mathcal{K}^*$. We will use adjoint equations later in this chapter.
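
The adjoint relationship can be checked numerically on a sampled version of the problem. The sketch below is only an illustration under stated assumptions: the kernel, the two test functions, and the helper names `apply_kernel` and `braket` are hypothetical, and both the integral operator and the braket are approximated by midpoint sums on [0, 1]. Swapping the kernel's arguments plays the role of the adjoint kernel k(u, t).

```python
# Discrete stand-in for Equation 16.12: <Kf|g> = <f|K*g>,
# where K* is built from the adjoint kernel k(u, t).

def kernel(t, u):
    # Hypothetical, deliberately non-symmetric kernel.
    return t * t + 2.0 * u

def apply_kernel(k, x, pts, h):
    # (Kx)(t_i) ~= sum_j k(t_i, u_j) x(u_j) h  (midpoint sum)
    return [sum(k(t, u) * xu for u, xu in zip(pts, x)) * h for t in pts]

def braket(f, g, h):
    # <f|g> ~= sum_i f(t_i) g(t_i) h for real-valued samples
    return sum(fi * gi for fi, gi in zip(f, g)) * h

n = 50
h = 1.0 / n
pts = [(i + 0.5) * h for i in range(n)]       # midpoints of [0, 1]
f = [t for t in pts]                          # f(t) = t
g = [1.0 - t * t for t in pts]                # g(t) = 1 - t^2

def adjoint_kernel(t, u):
    return kernel(u, t)                       # swap the two arguments

lhs = braket(apply_kernel(kernel, f, pts, h), g, h)
rhs = braket(f, apply_kernel(adjoint_kernel, g, pts, h), h)
print(lhs, rhs)   # the two values agree up to rounding
```

In this discrete setting the operator is a matrix and its adjoint is the transpose, which is why the two sums agree to rounding error rather than only approximately.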

16.3.1 Operator Norms 

In general, linear operators map functions from a domain X to a range Y, which
may be in different spaces. Because they are linear, the set of all such operators
itself forms a linear space. We can define a norm on this space by finding the
largest magnifying effect that an operator $\mathcal{A}$ has on any function x in the domain
space X:

$$ \|\mathcal{A}\| = \sup_{\|x\| \neq 0} \frac{\|\mathcal{A}x\|_Y}{\|x\|_X} = \sup_{\|x\| = 1} \|\mathcal{A}x\|_Y \tag{16.14} $$

where the subscripts indicate the space in which each norm is taken. In words, this 
tells us to look at every element in the space X, apply the operator to it, and then 
find the ratio of the norm of the result (in the space of the result) with the norm of 
the input (in the space of the input). The largest ratio is the norm of the operator. 
Intuitively, if we place a unit sphere at the origin in space X and apply the operator 
$\mathcal{A}$ to it, the sphere will in general be turned into some blob, where $\|\mathcal{A}\|$ is the largest
radial distance from the origin to the surface of the blob. 
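
As a finite-dimensional illustration of this definition, the sketch below uses a small matrix as a stand-in for the operator and the max norm on both spaces; these choices, the matrix entries, and the helper names are assumptions made only for the example. For the max norm, the supremum of the ratio is known to equal the largest absolute row sum, so random inputs can never exceed it.

```python
import random

def apply(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def max_norm(x):
    return max(abs(v) for v in x)

def induced_max_norm(A):
    # For the max norm, sup ||Ax|| / ||x|| equals the largest absolute row sum.
    return max(sum(abs(a) for a in row) for row in A)

A = [[1.0, -2.0], [0.5, 3.0]]    # hypothetical 2x2 stand-in for an operator
norm_A = induced_max_norm(A)

random.seed(0)
ratios = []
for _ in range(1000):
    x = [random.uniform(-1.0, 1.0) for _ in range(2)]
    if max_norm(x) > 0.0:
        ratios.append(max_norm(apply(A, x)) / max_norm(x))

# The supremum is attained on the unit "sphere" at a sign vector
# matched to the row with the largest absolute sum.
attained = max_norm(apply(A, [1.0, 1.0]))
print(norm_A, max(ratios), attained)
```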

We have defined the operator norm with respect to two spaces: a domain X and a
range Y. Often we will only deal with subspaces of X; that is, functions x may come
from one subspace $X_1$ and be mapped into $X_2$, but both $X_1$ and $X_2$ are contained
in X.

Because the operator norm is the largest stretching that can occur, we can say for
any input x:

$$ \|\mathcal{A}x\| \le \|\mathcal{A}\| \cdot \|x\| \tag{16.15} $$

From this, we can find an important result describing the combination of two operators
$\mathcal{A}$ and $\mathcal{B}$:

$$ \|\mathcal{A}\mathcal{B}\| \le \|\mathcal{A}\| \cdot \|\mathcal{B}\| \tag{16.16} $$

An operator $\mathcal{A}$ is said to be bounded if $\|\mathcal{A}\| < \infty$.

The norm of $\mathcal{L}$ is bounded by the triangle inequality:
$\|\mathcal{L}\| = \|\mathcal{I} - \lambda\mathcal{K}\| \le 1 + |\lambda|\,\|\mathcal{K}\|$. The norm of its inverse, $\|\mathcal{L}^{-1}\|$,
is not decomposable in the same way.

16.4 Solution Techniques 

Golberg [162] has identified five principal categories of methods for solving integral 
equations, which we summarize here.

Analytical and semianalytical methods: These techniques rely at least partly on an
analytic representation of the solution. This generally requires an analytic
representation of all the unknown components in the equation, including the
kernel. Unfortunately, practical (that is, nontrivial) image synthesis problems
involve kernels that are unlikely to be completely representable analytically.

Kernel approximation methods: We find a sequence of kernels $k_n(t, u)$ for n =
1, 2, ... so that $k_n(t, u)$ converges to k(t, u) as $n \to \infty$. For each kernel, we
find an approximate solution $x_n$. Note that each $x_n$ itself is an approximation
to the true solution for the kernel $\mathcal{K}_n$, which is in turn an approximation of
the ideal kernel $\mathcal{K}$. This method is typically only useful for degenerate kernels
(discussed below).

Projection methods: These are the most popular methods in practice, and their discussion
will occupy the bulk of this chapter. A projection method converts
the original equation into another equation in some other, smaller space. For
example, we might look for the best solution only among the polynomials, or
sums of sines and cosines. Projection methods allow us to restrict the domain
of possible solutions to our equation, which can make them easier to find.

Quadrature methods: We replace the integration represented by the kernel with a
finite summation. This approximation to an integral by a sum is called numerical
quadrature. We can view quadrature methods as a subclass of projection
methods by thinking of the quadrature operation as the projection of the integral
operator into a finite-dimensional subspace represented by the summation
operator.

Volterra and initial value methods: Rather than solve the given equation, we can 
sometimes show that the solution would satisfy some other set of equations, 
and solve those instead. 

The kernel operator $\mathcal{K}$ in Equation 16.9 is usually too complicated for us to
solve the original integral equation analytically. In computer graphics, the kernel 
represents the transfer of energy to a point from potentially everywhere else in the 
scene; this transfer can be arbitrarily complex and can easily baffle our best analytic 
techniques. There’s no hope of finding a general, analytic solution to the radiance 
equation in practical situations. 

The analytic, semianalytic, Volterra, and initial value methods all require us to 
know something about the kernel and driving functions. These techniques may be 
useful in rendering to handle special cases where the functions are tractable, but this 
has not been explored much except to suppress singularities (discussed later). 

So we turn instead to approximate solutions; in this book we focus on the quadrature
and projection methods. We will also, however, discuss two important analytic
methods that lend key ideas to the approximation techniques. We will call them symbolic
methods here, because we will not actually evaluate anything analytically, but
rather juggle symbolic representations into a variety of useful alternatives. Projection
methods usually involve approximating everything in sight by linear equations,
resulting in a single big matrix equation.

No matter how we do it, note that if we change the structure of the problem, then 
even a perfect answer to the new problem may have little relevance to the original 
problem. That is, starting with the original problem of finding 

$$ x = (\mathcal{I} - \lambda\mathcal{K})^{-1} g \tag{16.17} $$

we may replace the kernel and the driving function by approximations $\tilde{\mathcal{K}}$ and $\tilde{g}$:

$$ \tilde{x} = (\mathcal{I} - \lambda\tilde{\mathcal{K}})^{-1} \tilde{g} \tag{16.18} $$

An exact solution $\tilde{x}$ to this problem may be identical to the original x, or very similar
to it, or completely unrelated to it. And if the solution is only an approximation to
$\tilde{x}$, then we’re yet another step away from the desired answer.

To properly track the correspondence of our approximations to the ideal solution
requires that we carefully monitor the error introduced at every step in the
approximation. Usually we can’t express the error exactly, and instead settle for an 
error bound , or a guaranteed upper limit on how much error can be introduced. 
These bounds are often conservative , meaning that when there is any uncertainty, we 
normally use the worst possible error. Alternatively, we may use probable bounds, 
which aren’t as pessimistic as conservative bounds, but also don’t come with the 
same assurance. In general, bounds can only be estimated, though we believe in 
some Platonic ideal bound , or perfect bound , that would indicate the error if only 
we had the tools to find it. The tighter an error bound is, the more closely it matches 
this perfect bound; a loose bound is generally suspected of being rather far from the 
ideal value. 

Tracking the error at every stage is difficult business; it requires patience and 
great attention to detail [353]. Furthermore, the error in the realization of any 
algorithm is also highly dependent on the details of its particular programming and 
host hardware. Much can be said about the best or expected error properties of these 
algorithms, but this is only part of our puzzle in image synthesis, where shading, 
visibility, and other algorithms that we use to solve the rendering equation have 
their own errors. The interested reader can find detailed error analyses for integral 
equation methods in the references listed in the Further Reading section. 


16.4.1 Residual Minimization

We now turn to looking for solutions of $x = \mathcal{L}^{-1}g$ in general. Methods of finding
the “best” version of a function often come down to minimizing the value of some
norm over a class of candidates. A common method for describing this approach in
general is called residual minimization.

We begin by recalling the composite operator $\mathcal{L} = (\mathcal{I} - \lambda\mathcal{K})$, satisfying $\mathcal{L}x = g$.
We now introduce the residual function $r_n$ and the error function $e_n$ for some
approximation $x_n$:

$$
\begin{aligned}
r_n &= g - \mathcal{L}x_n \\
e_n &= x - x_n
\end{aligned}
\tag{16.19}
$$

To evaluate $r_n$, we don’t need the true solution x. By subtracting $g - \mathcal{L}x = 0$ from
$r_n$, we can derive the identity

$$
\begin{aligned}
r_n &= (g - \mathcal{L}x_n) - (g - \mathcal{L}x) \\
&= \mathcal{L}(x - x_n) \\
&= \mathcal{L}e_n
\end{aligned}
\tag{16.20}
$$

so the residual function is just the error function passed through the composite
operator $\mathcal{L}$.

To find the function $x_n$ that comes closest to x, we can try to minimize the norm
of the residual, which measures the distance between g and $\mathcal{L}x_n$, yielding the “best”
approximate function $x_n$. The choice of norm exerts a great deal of influence on our
selection of an approximate solution $x_n$. We will see that different algorithms may
be distinguished on the basis of which norm they attempt to minimize.
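
The identity between the residual and the transformed error can be seen in a small finite-dimensional sketch. Here a 2x2 matrix stands in for the composite operator; the matrix entries, the "true" solution, and the approximation are all hypothetical values chosen only for illustration.

```python
def apply(L, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in L]

def sub(a, b):
    return [ai - bi for ai, bi in zip(a, b)]

# Hypothetical 2x2 stand-in for the composite operator L = I - lambda*K.
L = [[0.8, -0.1], [-0.2, 0.9]]
x_true = [1.0, 2.0]
g = apply(L, x_true)             # construct g so that L x = g holds exactly

x_n = [0.9, 2.1]                 # some approximate solution
r_n = sub(g, apply(L, x_n))      # residual: computable from g and x_n alone
e_n = sub(x_true, x_n)           # error: requires the true solution
L_e_n = apply(L, e_n)            # error passed through L (Equation 16.20)
print(r_n, L_e_n)                # the two vectors agree up to rounding
```

The point of the comparison is practical: the residual can be computed without knowing the solution, which is what makes it usable as the quantity to minimize.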


16.5 Degenerate Kernels 

We mentioned above that some special methods are available for degenerate kernels. 
In fact, methods have been developed for a wide range of specialized kernel forms, 
of which degenerate kernels are only one example. Degenerate kernels are those that
can be defined as a finite sum of products of one-parameter functions:

$$ k(t, u) = \sum_{i=1}^{n} a_i(t)\,b_i(u) \tag{16.21} $$

A separable kernel is composed of only one such pair (that is, n = 1):

$$ k(t, u) = a(t)\,b(u) \tag{16.22} $$

A degenerate kernel may be written as a sum of separable kernels.

When an operator applies a kernel of this type to a vector x, we can find the result
with an elegant symbolic construction [343]. The basic idea is that in an integral
equation $x = g + \lambda\mathcal{K}x$ with a degenerate kernel, we can think of the functions $a_i$
as forming a basis for some function space. It turns out that the solution x may be
represented in this space by solving for its coefficients on the functions $a_i$ (it’s okay
if the $a_i$ are not linearly independent; we discuss the way to handle this below).

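We begin by expanding out the operation:

$$
\begin{aligned}
(\mathcal{K}x)(t) &= \int_a^b k(t,u)\,x(u)\,du \\
&= \int_a^b \sum_{i=1}^{n} a_i(t)\,b_i(u)\,x(u)\,du \\
&= \sum_{i=1}^{n} a_i(t) \int_a^b b_i(u)\,x(u)\,du \\
&= \sum_{i=1}^{n} a_i(t)\,\langle b_i \mid x \rangle
\end{aligned}
\tag{16.23}
$$

We note that a scaling factor $\lambda$ applied to $\mathcal{K}x$ can be moved inside the sum next to
the braket, and their product replaced by a new constant $\gamma_i$:

$$
\begin{aligned}
\lambda(\mathcal{K}x)(t) &= \sum_{i=1}^{n} a_i(t)\,\lambda\langle b_i \mid x \rangle \\
&= \sum_{i=1}^{n} a_i(t)\,\gamma_i
\end{aligned}
\tag{16.24}
$$

Recalling $x = g + \lambda\mathcal{K}x$, we would like to find the values for $\gamma_i$ that describe x.
We begin by writing

$$
\begin{aligned}
x(t) - g(t) &= (\lambda\mathcal{K}x)(t) \\
\sum_{i=1}^{n} a_i(t)\,\gamma_i &= \lambda \sum_{i=1}^{n} a_i(t)\,\langle b_i \mid x \rangle \\
&= \lambda \sum_{i=1}^{n} a_i(t)\,\langle b_i \mid (g + \lambda\mathcal{K}x) \rangle \\
&= \lambda \sum_{i=1}^{n} a_i(t) \left( \langle b_i \mid g \rangle + \langle b_i \mid \lambda\mathcal{K}x \rangle \right)
\end{aligned}
\tag{16.25}
$$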

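Now we have an expression for $\lambda(\mathcal{K}x)$ from Equation 16.24, so plugging that into
the rightmost term and using linearity, we find

$$
\begin{aligned}
\sum_{i=1}^{n} a_i(t)\,\gamma_i &= \lambda \sum_{i=1}^{n} a_i(t) \left( \langle b_i \mid g \rangle + \Big\langle b_i \,\Big|\, \sum_{j=1}^{n} a_j \gamma_j \Big\rangle \right) \\
&= \lambda \sum_{i=1}^{n} a_i(t) \left( \langle b_i \mid g \rangle + \sum_{j=1}^{n} \gamma_j \langle b_i \mid a_j \rangle \right) \\
&= \sum_{i=1}^{n} a_i(t) \left( \beta_i + \sum_{j=1}^{n} \gamma_j \alpha_{ij} \right)
\end{aligned}
\tag{16.26}
$$

where

$$
\begin{aligned}
\beta_i &= \lambda \int_a^b b_i(u)\,g(u)\,du \\
\alpha_{ij} &= \lambda \int_a^b b_i(u)\,a_j(u)\,du
\end{aligned}
\tag{16.27}
$$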

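We can find values for each $\alpha_{ij}$ and $\beta_i$, so all that remains is to find the $\gamma_i$.

For convenience, we will assume that the functions $a_i(t)$ are linearly independent.
This isn’t a restrictive assumption; if these functions are not linearly independent,
we can reexpress the kernel in terms of a smaller number of basis vectors $a'_i(t)$ that
span the space of the old $a_i(t)$. Then Equation 16.26 represents n equations in the
n unknowns $\gamma_i$, each of the form:

$$ \gamma_i = \sum_{j=1}^{n} \alpha_{ij}\,\gamma_j + \beta_i \tag{16.28} $$

In matrix form, we can write $\boldsymbol{\gamma} = A\boldsymbol{\gamma} + \mathbf{b}$, or $(I - A)\boldsymbol{\gamma} = \mathbf{b}$. In tableau form,

$$
\begin{bmatrix}
1 - \alpha_{11} & -\alpha_{12} & \cdots & -\alpha_{1n} \\
-\alpha_{21} & 1 - \alpha_{22} & \cdots & -\alpha_{2n} \\
\vdots & \vdots & \ddots & \vdots \\
-\alpha_{n1} & -\alpha_{n2} & \cdots & 1 - \alpha_{nn}
\end{bmatrix}
\begin{bmatrix} \gamma_1 \\ \gamma_2 \\ \vdots \\ \gamma_n \end{bmatrix}
=
\begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_n \end{bmatrix}
\tag{16.29}
$$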

If (I - A) is nonsingular, it can be inverted, $\boldsymbol{\gamma} = (I - A)^{-1}\mathbf{b}$, yielding the unique
set of $\gamma_i$ that describes the function satisfying the original integral equation:

$$ x(t) = g(t) + \sum_{i=1}^{n} \gamma_i\,a_i(t) \tag{16.30} $$

FIGURE 16.2

The Fubini theorem. (a) Scanning across u and sweeping up v. (b) Scanning up v and sweeping
across u.


This is our first example of a solution technique for solving an integral equation. 
We started by assuming something about the problem (here it was the form of the 
kernel), and then we used that assumption to simplify the problem. This is typical 
of most solution methods. 
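
The degenerate-kernel construction can be sketched in a few lines of code. Everything specific below is an assumption made for illustration: the two-term kernel k(t, u) = 1 + tu on [0, 1], the driving function g(t) = t, the value of lambda, the use of a composite Simpson rule for the integrals of Equation 16.27, and the helper names. The sketch builds the matrix of Equation 16.29, solves it, and then checks the result by substituting it back into the integral equation.

```python
def simpson(f, a, b, n=100):
    # Composite Simpson rule with an even number n of panels.
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i * h)
    return s * h / 3.0

def solve(M, rhs):
    # Gaussian elimination with partial pivoting on a small dense system.
    n = len(rhs)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            m = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= m * aug[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        s = sum(aug[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (aug[r][n] - s) / aug[r][r]
    return x

def solve_degenerate(a_fns, b_fns, g, lam, lo, hi):
    n = len(a_fns)
    # Equation 16.27: beta_i and alpha_ij.
    beta = [lam * simpson(lambda u, i=i: b_fns[i](u) * g(u), lo, hi)
            for i in range(n)]
    alpha = [[lam * simpson(lambda u, i=i, j=j: b_fns[i](u) * a_fns[j](u), lo, hi)
              for j in range(n)] for i in range(n)]
    # Equation 16.29: (I - A) gamma = b.
    M = [[(1.0 if i == j else 0.0) - alpha[i][j] for j in range(n)]
         for i in range(n)]
    gamma = solve(M, beta)
    # Equation 16.30: x(t) = g(t) + sum_i gamma_i a_i(t).
    return lambda t: g(t) + sum(c * a(t) for c, a in zip(gamma, a_fns))

# Hypothetical example: k(t, u) = 1 + t*u on [0, 1], g(t) = t, lambda = 0.5.
a_fns = [lambda t: 1.0, lambda t: t]
b_fns = [lambda u: 1.0, lambda u: u]
g = lambda t: t
lam = 0.5
x = solve_degenerate(a_fns, b_fns, g, lam, 0.0, 1.0)

def residual(t):
    # x(t) - g(t) - lambda * integral of k(t, u) x(u) du
    return x(t) - g(t) - lam * simpson(lambda u: (1.0 + t * u) * x(u), 0.0, 1.0)

print([residual(t) for t in (0.0, 0.3, 1.0)])   # all close to zero
```

Because every integrand in this example is a low-degree polynomial, Simpson's rule integrates it exactly up to rounding; for general functions the quadrature would introduce its own error.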


16.6 Symbolic Methods 

In this section we will consider a number of symbolic manipulations to Equation 16.9 
that are intended to make it more tractable for solution. 


16.6.1 The Fubini Theorem

We begin with a simple observation that we will find useful in this section. Suppose 
that we have a continuous, real function b(u , v) of two real parameters, and we want 
to evaluate the double integral: 


$$ \int_{u=0}^{s} \int_{v=0}^{u} b(u, v)\,dv\,du \tag{16.31} $$

We can see from Figure 16.2(a) that the domain of integration is the triangle defined by
$0 \le u \le s$, $0 \le v \le u$. In effect, we are moving across the u axis and sweeping upward
from v = 0 to v = u at each step.

We can reverse the order of integration, as shown in Figure 16.2(b). Now we 
move up the v axis, sweeping right from u = v to u = s at each step, covering the 
same territory bottom to top and left to right instead of in the opposite order. Stated 
symbolically, this is the Fubini theorem: 


$$ \int_{u=0}^{s} \int_{v=0}^{u} b(u,v)\,dv\,du = \int_{v=0}^{s} \int_{u=v}^{s} b(u,v)\,du\,dv \tag{16.32} $$
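
A quick numerical check of the two orders of integration is sketched below. The integrand b(u, v) = uv, the upper limit s = 1, and the use of a composite midpoint rule are assumptions chosen for the example; for this integrand the exact value is s^4/8.

```python
def midpoint(f, lo, hi, n=200):
    # Composite midpoint rule for a 1D integral.
    h = (hi - lo) / n
    return sum(f(lo + (i + 0.5) * h) for i in range(n)) * h

def b(u, v):
    return u * v          # hypothetical integrand

s = 1.0

# Left side of Equation 16.32: outer u in [0, s], inner v in [0, u].
v_then_u = midpoint(lambda u: midpoint(lambda v: b(u, v), 0.0, u), 0.0, s)

# Right side: outer v in [0, s], inner u in [v, s].
u_then_v = midpoint(lambda v: midpoint(lambda u: b(u, v), v, s), 0.0, s)

print(v_then_u, u_then_v)   # both approximate s**4 / 8
```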


16.6.2 Successive Substitution

Perhaps the most straightforward method for solving Equation 16.1 for the unknown 
function x starts with the form in Equation 16.5: 

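$$ x = g + \lambda\mathcal{K}x \tag{16.33} $$

Since we have an expression for x on the left-hand side, we can simply plug that into
the right-hand side:

$$
\begin{aligned}
x &= g + \lambda\mathcal{K}x \\
&= g + \lambda\mathcal{K}(g + \lambda\mathcal{K}x) \\
&= g + \lambda\mathcal{K}g + \lambda^2\mathcal{K}^2 x
\end{aligned}
\tag{16.34}
$$

We can then repeat the whole process:

$$
\begin{aligned}
x &= g + \lambda\mathcal{K}g + \lambda^2\mathcal{K}^2 x \\
&= g + \lambda\mathcal{K}g + \lambda^2\mathcal{K}^2(g + \lambda\mathcal{K}x) \\
&= g + \lambda\mathcal{K}g + \lambda^2\mathcal{K}^2 g + \lambda^3\mathcal{K}^3 x
\end{aligned}
\tag{16.35}
$$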

and so on. If we stop after n steps, then we get a recurrence relation for the n-step
estimate $x_n$:

$$
\begin{aligned}
x_n &= g + \lambda\mathcal{K}x_{n-1} \\
&= \sum_{i=0}^{n-1} (\lambda\mathcal{K})^i g
\end{aligned}
\tag{16.36}
$$

where we have dropped the highest-order term $(\lambda\mathcal{K})^n x$; we will see why this is
reasonable in the next section. This relation defines the technique of successive
substitution. We will look at its error properties in the next section.

16.6.3 Neumann Series

In successive substitution we replaced the estimated solution at each step. Alternatively,
we can iterate replacements on the operator. We get the same result, but from
a different point of view.

We begin by recalling that for a complex number z, with |z| < 1, we can write
the infinite series

$$ \frac{1}{1 - z} = \sum_{i=0}^{\infty} z^i = 1 + z + z^2 + z^3 + \cdots \tag{16.37} $$

Using this series as inspiration, think of z as an operator. It can be shown that under
certain reasonable conditions, interpreting Equation 16.37 as an expression for an
operator with a norm less than one is valid [343].

Then we can write an expression for x in terms of the operator $(\mathcal{I} - \lambda\mathcal{K})^{-1}$, and
use Equation 16.37 as an approximation of that operator:

$$
\begin{aligned}
x &= (\mathcal{I} - \lambda\mathcal{K})^{-1} g \\
&= \frac{1}{\mathcal{I} - \lambda\mathcal{K}}\, g \\
&= \sum_{i=0}^{\infty} (\lambda\mathcal{K})^i g
\end{aligned}
\tag{16.38}
$$

Terminating the expansion after n terms gives us an n-step approximation $x_n$ to x:

$$ x_n = \sum_{i=0}^{n} (\lambda\mathcal{K})^i g \tag{16.39} $$

This approximation is called the Neumann series. This formula has the same form as
Equation 16.36; the two differ only in how the terms are counted.
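
As a small worked example of the series, consider a case chosen only for illustration (it is an assumption of this sketch, not an example from the text): the separable kernel k(t, u) = tu on [0, 1] with driving function g(t) = t. Each application of the operator simply scales g by 1/3, so the series is geometric and can be summed in closed form:

```latex
\begin{aligned}
(\mathcal{K}g)(t) &= \int_0^1 t\,u \cdot u \,du = \frac{t}{3}, \qquad
(\mathcal{K}^i g)(t) = \frac{t}{3^i} \\
x(t) &= \sum_{i=0}^{\infty} (\lambda\mathcal{K})^i g
      = t \sum_{i=0}^{\infty} \left(\frac{\lambda}{3}\right)^{i}
      = \frac{3t}{3 - \lambda}, \qquad |\lambda| < 3
\end{aligned}
```

Substituting x(t) = 3t/(3 - lambda) back into x = g + lambda K x confirms the result: the integral term contributes lambda t/(3 - lambda), and t + lambda t/(3 - lambda) = 3t/(3 - lambda).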

If we continue substituting forever, we get

$$ \lim_{n \to \infty} \sum_{i=0}^{n} \mathcal{K}^i = \mathcal{R} \tag{16.40} $$

where $x = g + \mathcal{K}x$. The operator $\mathcal{R}$ is called the resolvent operator, and it implements
the resolvent kernel.

For example, after the second step we have $(\mathcal{K}^2x)(t) = (\mathcal{K}(\mathcal{K}x))(t)$. We can write
out this kernel explicitly as

$$
\begin{aligned}
(\mathcal{K}^2x)(t) &= (\mathcal{K}(\mathcal{K}x))(t) \\
&= \int_{u=0}^{s} k(t,u)\,((\mathcal{K}x)(u))\,du \\
&= \int_{u=0}^{s} k(t,u) \int_{v=0}^{u} k(u,v)\,x(v)\,dv\,du \\
&= \int_{u=0}^{s} \int_{v=0}^{u} k(t,u)\,k(u,v)\,x(v)\,dv\,du
\end{aligned}
\tag{16.41}
$$

Now we can use the Fubini theorem to switch the order of integration:

$$
\begin{aligned}
(\mathcal{K}^2x)(t) &= \int_{v=0}^{s} \int_{u=v}^{s} k(t,u)\,k(u,v)\,x(v)\,du\,dv \\
&= \int_{v=0}^{s} \left[ \int_{u=v}^{s} k(t,u)\,k(u,v)\,du \right] x(v)\,dv \\
&= \int_{v=0}^{s} k_2(t,v)\,x(v)\,dv \\
&= (\mathcal{K}_2 x)(t)
\end{aligned}
\tag{16.42}
$$

where the kernel of $\mathcal{K}_2$ is

$$ k_2(t,v) = \int_{u=v}^{s} k(t,u)\,k(u,v)\,du \tag{16.43} $$

In general, the iterated kernel of order n is given by

$$
\begin{aligned}
k_n(t,v) &= \int_{u=v}^{s} k(t,u)\,k_{n-1}(u,v)\,du \\
k_0(t,v) &= \mathcal{I}
\end{aligned}
\tag{16.44}
$$
Since we’re now focusing on iterating the operator rather than the approximate
solution, we are tempted to analyze the error from the same point of view. Following
Arvo [14], we define the operator $\mathcal{M}_n$ as the result of n steps of this series:

$$ \mathcal{M}_n = \sum_{i=0}^{n} \lambda^i \mathcal{K}^i \tag{16.45} $$

So the ideal solution is given by $x = \mathcal{M}_\infty g$.

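The error in $x_n$ is the error involved from using $\mathcal{M}_n$ instead of $\mathcal{M}_\infty$:

$$
\begin{aligned}
\|\mathcal{M}_\infty - \mathcal{M}_n\| &= \left\| \sum_{i=0}^{\infty} \lambda^i\mathcal{K}^i - \sum_{i=0}^{n} \lambda^i\mathcal{K}^i \right\| \\
&= \left\| \sum_{i=n+1}^{\infty} \lambda^i\mathcal{K}^i \right\|
\le \sum_{i=n+1}^{\infty} \|\lambda^i\mathcal{K}^i\|
\le \sum_{i=n+1}^{\infty} \|\lambda\mathcal{K}\|^i
\end{aligned}
\tag{16.46}
$$

using the triangle inequality from the definition of a norm, and Equation 16.16.
Since $\|\lambda\mathcal{K}\|$ is a real number and recalling the series identity (valid for |a| < 1)

$$ \sum_{j=n+1}^{\infty} a^j = \frac{a^{n+1}}{1 - a} \tag{16.47} $$

we can write the error in the approximate operator $\mathcal{M}_n$ as

$$ \|\mathcal{M}_\infty - \mathcal{M}_n\| \le \frac{\|\lambda\mathcal{K}\|^{n+1}}{1 - \|\lambda\mathcal{K}\|} \tag{16.48} $$

Returning now to $x_n = \mathcal{M}_n g$, we can find its error $\|x - x_n\|$ as

$$
\|x - x_n\| = \|\mathcal{M}_\infty g - \mathcal{M}_n g\| \le \|\mathcal{M}_\infty - \mathcal{M}_n\| \cdot \|g\|
\le \frac{\|\lambda\mathcal{K}\|^{n+1}}{1 - \|\lambda\mathcal{K}\|}\,\|g\|
\tag{16.49}
$$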


So the error depends on the magnitude of $1 - \|\lambda\mathcal{K}\|$. If $\|\lambda\mathcal{K}\| < 1$, we see that the
difference between the true solution x and successive iterates $x_n$ from the Neumann
series goes to zero as $n \to \infty$.
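
The error bound can be observed on a finite-dimensional stand-in for the operator. In the sketch below a 2x2 matrix plays the role of the kernel operator and the max norm (with its induced matrix norm, the largest absolute row sum) plays the role of the function norm; the matrix, the vector g, the value of lambda, and the helper names are all hypothetical choices for illustration.

```python
def apply(K, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in K]

def max_norm(x):
    return max(abs(v) for v in x)

def induced_max_norm(K):
    # Operator norm induced by the max norm: largest absolute row sum.
    return max(sum(abs(a) for a in row) for row in K)

lam = 0.5
K = [[0.4, 0.6], [0.2, 0.5]]      # hypothetical stand-in for the operator
g = [1.0, 2.0]
norm_lamK = abs(lam) * induced_max_norm(K)    # less than 1, so the series converges

def neumann(n):
    # x_n = sum_{i=0}^{n} (lambda K)^i g  (Equation 16.39)
    term, total = g[:], g[:]
    for _ in range(n):
        term = [lam * v for v in apply(K, term)]
        total = [s + t for s, t in zip(total, term)]
    return total

# Reference solution of (I - lambda K) x = g by Cramer's rule (2x2 only).
a, b = 1 - lam * K[0][0], -lam * K[0][1]
c, d = -lam * K[1][0], 1 - lam * K[1][1]
det = a * d - b * c
x = [(d * g[0] - b * g[1]) / det, (a * g[1] - c * g[0]) / det]

def error_and_bound(n):
    err = max_norm([xi - xn for xi, xn in zip(x, neumann(n))])
    bound = norm_lamK ** (n + 1) / (1 - norm_lamK) * max_norm(g)   # Eq. 16.49
    return err, bound

for n in (1, 5, 10, 20):
    print(n, *error_and_bound(n))   # the error stays below the bound
```

The bound is conservative in the sense described earlier: the observed error is typically smaller than the guaranteed limit, but never larger.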

Note that the kernel of the approximate operator $\mathcal{M}_n$ may be expressed as the
sum of the first n iterated kernels:

$$ m_n(t, v) = \sum_{i=0}^{n} k_i(t, v) \tag{16.50} $$


16.7 Numerical Approximations 

The symbolic methods discussed in the previous section are useful for thinking about 
the problem of finding the unknown function, but they are not immediately practical 
in computer graphics. Numerical methods hold much more promise for quantitative 
solutions, so we now turn to numerical algorithms for finding the unknown function 
x. Here we simply take the integral equation as given and attempt to replace it with 
a computable approximation. Development of a good numerical algorithm is not a 
casual task; we must be scrupulous in every aspect of the design and implementation, 
including the effects of word size and floating-point resolution in a particular machine.
A discussion of some of the pitfalls in designing good numerical algorithms
may be found in Press et al. [348] and Ralston and Rabinowitz [353]. 

In general, each approach will find an approximation to x that best matches a 
set of conditions; sometimes we will need to search a space of functions to find that 
best match; other times we need simply solve a matrix equation.

16.7.1 Numerical Integration (Quadrature)

Most of the numerical algorithms that we will discuss for solving integral equations
end up computing one or more 1D integrations along the way. The speed and
accuracy of different integration routines, when applied to different problems, can
vary tremendously. A discussion of different algorithms and their trade-offs may be
found in Press et al. [348], Ralston and Rabinowitz [353], and Delves and Mohamed
[120]. For completeness, we present here a short introduction to the subject.

Any numerical method for computing an approximation to an integral is called
a quadrature rule. We can write the perfect integration we desire as the operator $\mathcal{K}$
applied to a weighted function x(t):

$$ (\mathcal{K}x)(t) = \int_a^b w(u)\,k(t,u)\,x(u)\,du \tag{16.51} $$

where w(t) is a weight function. Although the operator $\mathcal{K}$ can in principle use any
information (given or measured) available about the function to improve the quality
of the integration, we will focus on methods that use only the value of x at a set of
points given by $\{t_i\}$. (We will label these points as $\{t_i\}$ when they appear outside
of an integral, but they will take on the dummy variable, typically u, inside the
integral, where they appear as $\{u_i\}$.) We write this quadrature rule as an operator
$\mathcal{Q}$:

$$ \mathcal{Q}x = \mathcal{K}x - \mathcal{E}_Q x \tag{16.52} $$

where the operator $\mathcal{E}_Q$ measures the error between the estimated integral $\mathcal{Q}x$ and the
actual value $\mathcal{K}x$.

This type of quadrature rule may be written

$$ (\mathcal{Q}x)(t) = \sum_{i=1}^{N} w_i\,k(t, u_i)\,x(u_i) = \langle \mathcal{Q} \mid x \rangle \tag{16.53} $$

In words, we measure the value of x at each point $u_i$, weight each measurement by
an associated value $w_i$, and then add the product into a running sum. The points
$u_i$ are called the quadrature points or abscissae, and the weights $w_i$ are called the
quadrature weights. The trick in designing a good rule is to choose the $u_i$ and $w_i$
that will make the estimate as good as possible.

There are three general classes for rules of this type [120]: 

Automatic rules: Neither the number of points N, nor the points $u_i$ themselves,
are determined in advance. Monte Carlo algorithms are examples of this
approach.

Optimal rules: Points and weights are chosen in advance so that for some class of
functions $X_o$, the value

$$ \sup_{x \in X_o} |\mathcal{K}x - \mathcal{Q}x| \tag{16.54} $$

is minimized over all functions $x \in X_o$.

Error annihilation rules: Points and weights are chosen in advance so that for some
class of functions $X_a$,

$$ |\mathcal{K}x - \mathcal{Q}x| = 0 \tag{16.55} $$

for all functions $x \in X_a$. The class $X_a$ is called the annihilation class for the
rule (an annihilation rule is also an optimal rule, though the optimization class
$X_o$ for a given $X_a$ may be difficult to determine).

In this section we will focus exclusively on annihilation rules. 


16.7.2 Method of Undetermined Coefficients

We start our study of quadrature rules with a straightforward construction. We will
assume that we have a set of N quadrature points $\{s_i\}$, and we want to find the
weights for a particular type of rule. We will only look for solutions in a function
space of finite dimensions.

In general, we will suppose that the ideal solution x lives in some abstract function
space X. Each finite-basis method has access to an n-dimensional subspace $X_n \subset X$,
which is spanned by a set of n basis functions $\{h_i\}$. Therefore the n-dimensional
approximation function $x_n$ selected by the method may be given by

$$ x_n = \sum_{i=1}^{n} \alpha_i h_i \tag{16.56} $$

which suggests interpreting $x_n$ as a point in this n-dimensional function space. We
call the vector $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_n)$ the function vector in space X. In general, for a
given subspace $X_n$, our goal in identifying an approximating n-dimensional function
$x_n$ becomes that of finding the coefficients of its function vector $\boldsymbol{\alpha}$.

An expression such as Equation 16.56 is sometimes called an expansion for the
vector (or function) x. Thus algorithms that result in finding the coefficients $\alpha_i$ are
sometimes collectively called expansion methods.

Now to create an annihilation rule, we want to choose the $u_i$ and $w_i$ so that
$\mathcal{Q}x = \mathcal{K}x$ for all choices of $\alpha_i$; that is, all functions x in the space $X_n$:

$$
\begin{aligned}
\mathcal{Q}x &= \mathcal{K}x \\
\sum_{i=1}^{N} w_i\,x(u_i) &= \int_a^b w(u)\,x(u)\,du
\end{aligned}
\tag{16.57}
$$


Since $\mathcal{K}$ is linear, we only need to annihilate the basis functions. That is, if we
have chosen our points and weights so we compute the exact integral for each basis
function, then linear combinations of the basis functions (that is, all functions in
the space spanned by those functions) will also be exactly integrated. So we have p
conditions (one for each basis function) that may be written as

$$
\begin{aligned}
\sum_{i=1}^{N} w_i\,h_1(u_i) &= m_1 = \int_a^b w(u)\,h_1(u)\,du \\
\sum_{i=1}^{N} w_i\,h_2(u_i) &= m_2 = \int_a^b w(u)\,h_2(u)\,du \\
&\;\;\vdots \\
\sum_{i=1}^{N} w_i\,h_p(u_i) &= m_p = \int_a^b w(u)\,h_p(u)\,du
\end{aligned}
\tag{16.58}
$$

These are called the undetermined coefficient equations, and the $m_i$ are called the
generalized moments of the weight function w(t) with respect to the bases $\{h_i\}$. This
approach is called the method of undetermined coefficients.

If p = N, then we can write this as a square matrix equation $H\mathbf{w} = \mathbf{m}$, or in
tableau form

$$
\begin{bmatrix}
h_1(u_1) & h_1(u_2) & \cdots & h_1(u_N) \\
h_2(u_1) & h_2(u_2) & \cdots & h_2(u_N) \\
\vdots & \vdots & & \vdots \\
h_p(u_1) & h_p(u_2) & \cdots & h_p(u_N)
\end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \\ \vdots \\ w_N \end{bmatrix}
=
\begin{bmatrix} m_1 \\ m_2 \\ \vdots \\ m_p \end{bmatrix}
\tag{16.59}
$$

which has a unique solution for the $w_i$ if the matrix H is nonsingular.

To annihilate a particular space of functions, we need only choose a basis $\{h_i\}$
that spans that space. A particularly common choice is the space of polynomials.
These are spanned by the monomials $\{m_i : m_i(t) = t^{i-1},\ i \ge 1\}$ (note that these
functions are not orthogonal). Using the monomials, we expect to find an N-point
rule that will match all polynomials of degree N - 1 or less.

For example, suppose we choose N = 2, intended to match all linear functions.
Select the interval [a, b] = [0, h], the weight function w(t) = 1, and quadrature points
$t_1 = 0$ and $t_2 = h$. The first two monomials are $m_1(t) = 1$ and $m_2(t) = t$, so we
have

$$
\begin{bmatrix} 1 & 1 \\ 0 & h \end{bmatrix}
\begin{bmatrix} w_1 \\ w_2 \end{bmatrix}
=
\begin{bmatrix} \int_0^h 1\,dt \\ \int_0^h t\,dt \end{bmatrix}
=
\begin{bmatrix} h \\ h^2/2 \end{bmatrix}
\tag{16.60}
$$

The solution to this matrix equation is $w_1 = w_2 = h/2$, yielding the familiar trapezoid
rule,

$$ \int_0^h x(t)\,dt \approx w_1 x(t_1) + w_2 x(t_2) = \frac{h}{2}\,[x(0) + x(h)] \tag{16.61} $$

We can easily add more points to the rule, to annihilate increasingly higher-order
polynomials. These are called the closed Newton-Cotes rules of degree N - 1.
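
The construction above is mechanical enough to automate. The following sketch, written under the assumptions of unit weight function and a monomial basis, sets up the square system with one row per monomial and one column per quadrature point, and solves it in exact rational arithmetic. The function names `solve` and `newton_cotes_weights` are hypothetical, chosen for this example.

```python
from fractions import Fraction

def solve(M, rhs):
    # Gauss-Jordan elimination in exact rational arithmetic.
    n = len(rhs)
    aug = [list(row) + [r] for row, r in zip(M, rhs)]
    for c in range(n):
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        piv = aug[c][c]
        aug[c] = [v / piv for v in aug[c]]
        for r in range(n):
            if r != c and aug[r][c] != 0:
                m = aug[r][c]
                aug[r] = [vr - m * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]

def newton_cotes_weights(points, a, b):
    # Row j holds the monomial t**j at every quadrature point (j = 0..N-1).
    pts = [Fraction(p) for p in points]
    a, b = Fraction(a), Fraction(b)
    n = len(pts)
    H = [[p ** j for p in pts] for j in range(n)]
    # Moments of w(t) = 1: the integral of t**j over [a, b].
    m = [(b ** (j + 1) - a ** (j + 1)) / (j + 1) for j in range(n)]
    return solve(H, m)

print(newton_cotes_weights([0, 1], 0, 1))                   # two points: trapezoid rule
print(newton_cotes_weights([0, Fraction(1, 2), 1], 0, 1))   # three points
```

With two points on [0, 1] the weights come out as 1/2 and 1/2, matching Equation 16.61 with h = 1; with three equally spaced points they are 1/6, 2/3, 1/6, the weights of Simpson's rule. Exact fractions are used here only to make the comparison with the hand calculation unambiguous.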

We can use any other basis {hi} that we want in order to create other rules that 
will annihilate other types of functions. Many such rules are covered in Delves and 
Mohamed [120] and Press et al. [348].

One way to improve the accuracy of an integration with simple low-order rules is 
to use repeated rules . This involves breaking up a domain into pieces and applying 
a low-order rule to each piece, rather than one large high-order rule. For example, 
if we have N = 6, we could apply three trapezoid rules side by side rather than one 
rule of order 6 to the entire interval. 
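
A repeated rule is straightforward to write down. The sketch below applies the two-point trapezoid rule of Equation 16.61 to each of several equal subintervals; the test integrand and the function names are assumptions made for illustration.

```python
def trapezoid(f, a, b):
    # Two-point rule of Equation 16.61 applied on [a, b].
    return 0.5 * (b - a) * (f(a) + f(b))

def repeated_trapezoid(f, a, b, panels):
    # Apply the two-point rule to each of `panels` equal subintervals.
    h = (b - a) / panels
    return sum(trapezoid(f, a + i * h, a + (i + 1) * h) for i in range(panels))

f = lambda t: t * t      # hypothetical integrand; exact integral on [0, 1] is 1/3
for panels in (1, 4, 16, 64):
    print(panels, repeated_trapezoid(f, 0.0, 1.0, panels))   # approaches 1/3
```

This version evaluates the integrand twice at each interior point for clarity; a production routine would share those evaluations between neighboring panels.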



16.7.3 Quadrature on Expanded Functions

In the following sections we will often apply a quadrature approximation of the
operator $\mathcal{L}$ to the expanded form of x. We will prepare for these operations by
developing a shorthand notation for the results of the expansion and approximation 
now. 

We note that in general any function x in an infinite-dimensional linear space
spanned by a basis $\{h_i\}$ may be represented as a linear sum of these bases:

$$ x(t) = \sum_{i=1}^{\infty} \alpha_i h_i(t) \tag{16.62} $$

Since computing an infinite number of coefficients is impractical, we instead project
such a series into a finite-dimensional linear space in order to work with it. The
easiest projection operation is truncation, where we simply stop the expansion after
n terms [26]:

$$ x(t) \approx \sum_{i=1}^{n} \alpha_i h_i(t) \tag{16.63} $$

Depending on the situation, we can consider this an n-term approximation of x in
an infinite-dimensional space, or an exact representation of a function within an
n-dimensional space.

We begin developing our notation by replacing x in $g = \mathcal{L}x$ with this finite
expansion in terms of n basis functions $\{h_i\}$:

$$
\begin{aligned}
g &= \mathcal{L}x \\
&= \mathcal{L} \sum_{i=1}^{n} \alpha_i h_i \\
&= \sum_{i=1}^{n} \alpha_i\,(\mathcal{L}h_i)
\end{aligned}
\tag{16.64}
$$

where we have used the linearity of summation and the $\mathcal{L}$ operator. So the transformation
of x can be accomplished simply by transforming the basis vectors and then
recombining them with exactly the same coefficients as in the original expansion.

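To find those transformed basis vectors in practice, we will usually need to approximate
the result with a quadrature rule $\mathcal{Q}$:

$$
\begin{aligned}
\mathcal{L}h_i &= (\mathcal{I} - \lambda\mathcal{K})h_i \\
&= h_i - \lambda(\mathcal{K}h_i) \\
&= h_i - \lambda(\mathcal{Q} + \mathcal{E}_Q)h_i
\end{aligned}
\tag{16.65}
$$

where the operator $\mathcal{K}$ is replaced by the sum of a quadrature rule $\mathcal{Q}$ and its error $\mathcal{E}_Q$.
If we expand this expression and replace $\mathcal{Q}$ with its explicit quadrature formula, we
find:

$$
\begin{aligned}
(\mathcal{L}h_i)(t) &= h_i(t) - \lambda(\mathcal{Q}h_i)(t) - \lambda(\mathcal{E}_Q h_i)(t) \\
&= h_i(t) - \lambda \left[ \sum_{m=1}^{q} w_m\,k(t, u_m)\,h_i(u_m) \right] - \lambda(\mathcal{E}_Q h_i)(t)
\end{aligned}
\tag{16.66}
$$

Ignoring the error for the moment, the first two terms give us a way to compute an approximation
to the transformed basis vector. We call this approximate, transformed
basis $p_i(t)$, and define

$$
\begin{aligned}
p_i &= (\mathcal{I} - \lambda\mathcal{Q})h_i \\
p_i(t) &= h_i(t) - \lambda \sum_{m=1}^{q} w_m\,k(t, u_m)\,h_i(u_m)
\end{aligned}
\tag{16.67}
$$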

m= 1 


so including the error, we have

$$ \mathcal{L}h_i = p_i - \lambda\mathcal{E}_Q h_i \tag{16.68} $$

We will more often work with the approximation $\mathcal{L}h_i \approx (\mathcal{I} - \lambda\mathcal{Q})h_i = p_i$. Notice
that to evaluate $(\mathcal{I} - \lambda\mathcal{Q})h_i$, we only need the values of $h_i$ and the kernel k at the q
quadrature points $u_m$.

In addition to the kernel and the quadrature rule, the functions $p_i(t)$ depend on
the set of basis functions $\{h_i\}$ used to represent $x_n$. We can use the same kernel and
rule to compute multiple sets of different $p_i$ for different basis functions, which may
span equivalent or different function spaces.
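
The approximate transformed basis function of Equation 16.67 can be sketched as a small function factory. The kernel, the basis function, the two-point trapezoid rule, and the name `transformed_basis` are all assumptions made for this example.

```python
def transformed_basis(h_i, kernel, lam, points, weights):
    # Equation 16.67: p_i(t) = h_i(t) - lambda * sum_m w_m k(t, u_m) h_i(u_m)
    samples = [h_i(u) for u in points]    # h_i sampled once at the u_m
    def p_i(t):
        q = sum(w * kernel(t, u) * s
                for w, u, s in zip(weights, points, samples))
        return h_i(t) - lam * q
    return p_i

# Hypothetical setup: k(t, u) = t*u on [0, 1], trapezoid rule, basis h(t) = t.
kernel = lambda t, u: t * u
points, weights = [0.0, 1.0], [0.5, 0.5]
p = transformed_basis(lambda t: t, kernel, 1.0, points, weights)
print(p(0.5))   # 0.5 - 0.5 * (0.5 * 1 * 1) = 0.25
```

Sampling the basis function once, outside the returned closure, mirrors the observation in the text that the same kernel and rule can be reused across different sets of basis functions.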


The Nystrom method finds an approximation to x(t) by numerical evaluation at a 
few particular points, and then iterating a guess at the function until it matches those 
points. The function is then interpolated to get its value elsewhere. 

We begin by recalling from the discussion of the Neumann series the basic iteration
formula

$$
\begin{aligned}
x_{n+1} &= g + \lambda\mathcal{K}x_n \\
x_0 &= g
\end{aligned}
\tag{16.69}
$$

If we expand the operators and the independent variable, we get a better idea of
what’s required to evaluate one step:

$$
\begin{aligned}
x_{n+1}(t) &= g(t) + \lambda(\mathcal{K}x_n)(t) \\
&= g(t) + \lambda \int_a^b k(t,u)\,x_n(u)\,du
\end{aligned}
\tag{16.70}
$$


To evaluate this iteration we need to find values for g(t) (which we assume are 
available upon demand) and the value of the integral on the right-hand side. We can 
estimate this integral with a quadrature rule Q: 

$$
\begin{aligned}
x_{n+1}(t) &= g(t) + \lambda(\mathcal{K}x_n)(t) \\
&= g(t) + \lambda(\mathcal{Q}x_n + \mathcal{E}_Q x_n)(t) \\
&= g(t) + \lambda\left[\sum_{i=1}^{N} w_i\,k(t,u_i)\,x_n(u_i) + (\mathcal{E}_Q x_n)(t)\right]
\end{aligned}
\tag{16.71}
$$

where $\mathcal{E}_Q$ is the error operator for the rule $\mathcal{Q}$, and the rule is evaluated at $N$
quadrature points $u_i$ in the domain $[a, b]$. We will now become brave and simply ignore the
error term $\mathcal{E}_Q x_n$. Then we have an iteration rule for an approximation $x_{n+1}$:

$$
x_{n+1}(t) = g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\,x_n(u_i) \tag{16.72}
$$

Let’s look at the first couple of iterations of this formula:

$$
\begin{aligned}
x_0(t) &= g(t) \\
x_1(t) &= g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\,x_0(u_i) \\
&= g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\,g(u_i) \\
x_2(t) &= g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\,x_1(u_i) \\
&= g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\left[g(u_i) + \lambda\sum_{m=1}^{N} w_m\,k(u_i,u_m)\,g(u_m)\right]
\end{aligned}
\tag{16.73}
$$


The pattern that’s developing shows that to evaluate $x_{n+1}(t)$ requires only $g(t)$, the
weights $w_i$, the driving function values $g(u_i)$, and the kernel values $k(t, u_i)$ evaluated
at $t$ and the quadrature points $u_i$. Assuming that we know or can find all of these
quantities, we’re ready to set up the whole iteration.

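The $N$ rules of type Equation 16.72 may be set up simultaneously in a matrix,
$\mathbf{x}_{n+1} = \mathbf{g} + \lambda\mathbf{N}\mathbf{x}_n$, or in tableau:

$$
\begin{bmatrix} x_{n+1}(t_1) \\ x_{n+1}(t_2) \\ \vdots \\ x_{n+1}(t_N) \end{bmatrix}
=
\begin{bmatrix} g(t_1) \\ g(t_2) \\ \vdots \\ g(t_N) \end{bmatrix}
+ \lambda
\begin{bmatrix}
w_1 k_{11} & w_2 k_{12} & \cdots & w_N k_{1N} \\
w_1 k_{21} & w_2 k_{22} & \cdots & w_N k_{2N} \\
\vdots & \vdots & \ddots & \vdots \\
w_1 k_{N1} & w_2 k_{N2} & \cdots & w_N k_{NN}
\end{bmatrix}
\begin{bmatrix} x_n(t_1) \\ x_n(t_2) \\ \vdots \\ x_n(t_N) \end{bmatrix}
\tag{16.74}
$$

where $k_{ik} = k(t_i, u_k)$. This matrix defines the Nyström equations. If they converge,
their limit $x_Q$ is called the Nyström solution for quadrature rule $\mathcal{Q}$. The choice of $\mathcal{Q}$
can strongly influence both the speed of convergence and the value of $x_Q$.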

To find $x(t)$ at a value of $t$ that is not a quadrature point ($t \ne u_i$), we can apply a
noniterative form of Equation 16.72, where we use the Nyström function $x_Q$ as the
interpolated function:

$$
x_Q(t) = g(t) + \lambda\sum_{i=1}^{N} w_i\,k(t,u_i)\,x_Q(u_i) \tag{16.75}
$$

We wrote out the right-hand side of this formula explicitly because we want to
emphasize that we can evaluate $x_Q$ at any value of $t$, depending only on its value at
a discrete number of quadrature points $u_i$.
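
As a rough illustration of Equations 16.72 and 16.75 working together, the following Python
sketch iterates at the quadrature points and then interpolates elsewhere. Everything specific
in it is an assumption made for the example: the evaluation points $t_i$ are taken to coincide
with the quadrature points $u_i$, the kernel is $k(t,u) = tu$ on $[0,1]$ with $\lambda = 1$, the
driving function is $g(t) = 2t/3$ (chosen so that the exact solution is $x(t) = t$), and the
function names are hypothetical.

```python
import math

def nystrom_solve(g, kernel, lam, points, weights, iterations=60):
    """Iterate Equation 16.72 at the quadrature points, starting from x_0 = g."""
    x = [g(u) for u in points]
    for _ in range(iterations):
        x = [g(t) + lam * sum(w * kernel(t, u) * xu
                              for w, u, xu in zip(weights, points, x))
             for t in points]
    return x

def nystrom_interpolate(t, g, kernel, lam, points, weights, x_at_points):
    """Evaluate x_Q(t) at an arbitrary t using Equation 16.75."""
    return g(t) + lam * sum(w * kernel(t, u) * xu
                            for w, u, xu in zip(weights, points, x_at_points))

# Two-point Gauss-Legendre rule on [0, 1] (an assumed choice of quadrature rule).
a = 0.5 / math.sqrt(3.0)
points, weights = [0.5 - a, 0.5 + a], [0.5, 0.5]

kernel = lambda t, u: t * u          # assumed kernel
g = lambda t: 2.0 * t / 3.0          # assumed driving function; exact solution is x(t) = t

x_q = nystrom_solve(g, kernel, 1.0, points, weights)
value = nystrom_interpolate(0.25, g, kernel, 1.0, points, weights, x_q)
```

A fixed iteration count keeps the sketch short; a practical version would more likely stop
when successive iterates agree to within a tolerance, and would need to detect divergence.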

Now even if Equation 16.74 does converge to a Nyström solution $x_Q$, this function
satisfies the approximate formula $(\mathcal{I} - \lambda\mathcal{Q})x_Q = g$, which may not necessarily
have much to do with the desired solution $x$ of $(\mathcal{I} - \lambda\mathcal{K})x = g$. The question then
is whether $x_Q$ is close to $x$. The answer to this question is not simple, but we can
sketch an outline of the error analysis.

Define the error function $e_Q$ in the Nyström solution for rule $\mathcal{Q}$ as

$$
e_Q(t) = x(t) - x_Q(t) \tag{16.76}
$$

We can express the approximation $x_Q$ with its error from Equation 16.71:

$$
x_Q(t) = g(t) + \lambda\int_a^b k(t,u)\,x_Q(u)\,du - E_u\,\lambda k(t,u)\,x_Q(u) \tag{16.77}
$$

where $E_u$ is the error from rule $\mathcal{Q}$ on $\lambda k(t,u)\,x_Q(u)$ as a function of $u$ for a given $t$.
Subtracting this approximate solution from the desired solution $x(t)$, we find


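$$
\begin{aligned}
e_Q(t) &= x(t) - x_Q(t) \\
&= g(t) + \lambda\int_a^b k(t,u)\,x(u)\,du - g(t) - \lambda\int_a^b k(t,u)\,x_Q(u)\,du + E_u\,\lambda k(t,u)\,x_Q(u) \\
&= E_u\,\lambda k(t,u)\,x_Q(u) + \lambda\int_a^b k(t,u)\,[x(u) - x_Q(u)]\,du \\
&= E_u\,\lambda k(t,u)\,x_Q(u) + \lambda\int_a^b k(t,u)\,e_Q(u)\,du \\
&= b_Q(t) + \lambda\int_a^b k(t,u)\,e_Q(u)\,du
\end{aligned}
\tag{16.78}
$$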


This argument has the interesting result that the error $e_Q(t)$ is given by a Fredholm
integral equation of the second kind, with the same kernel as the original equation but a
different driving term. In operator notation,

$$
(\mathcal{I} - \lambda\mathcal{K})\,e_Q(t) = b_Q(t) \tag{16.79}
$$

which leads to the error term

$$
\begin{aligned}
e_Q(t) &= (\mathcal{I} - \lambda\mathcal{K})^{-1}\,b_Q(t) \\
\|e_Q\| &\le \|(\mathcal{I} - \lambda\mathcal{K})^{-1}\|\,\|b_Q(t)\|
\end{aligned}
\tag{16.80}
$$

Unfortunately, the norm of this error term includes $b_Q(t)$, which itself is defined in
terms of the original function $x$ and the quadrature rule. But if we can estimate
a bound on this term, then we can find a bound on the error $e_Q$. This tells us
that the error in the Nyström approximant $x_Q$ is dependent on the operator norm
$\|(\mathcal{I} - \lambda\mathcal{K})^{-1}\|$ and the error in the quadrature rule $\mathcal{Q}$. Given an operator $\mathcal{K}$, we’re
stuck with its norm, but by choosing the quadrature rule carefully we can get $\|b_Q(t)\|$
to have small magnitude, and thus get good accuracy from this method. Detailed
discussion of the numerical performance of the Nyström method on a variety of
kernels with respect to a number of different quadrature rules is presented in Delves
and Mohamed [120].


16.7.8 Monte Carlo Quadrature

In Chapter 7 we discussed the Monte Carlo method for evaluating integrals. This 
is a form of automatic quadrature rule, where we do not choose the quadrature 
points in advance, but generate them according to some algorithm that is intended 
to simulate a random process of some kind. This approach has the power of letting 
us easily sample adaptively, beginning with some initial set of quadrature points and 
then accumulating additional points as needed. The drawback is that an efficient 
Monte Carlo method (say one that uses importance sampling) requires us to know 
(or guess) something about the nature of the underlying signal. The goal of most 
Monte Carlo methods is to find the parameters that characterize a signal; the family 
of functions that are thereby parameterized must be chosen in advance. We will see 
that this situation crops up again below when we discuss projection methods, which 
also require the solution space to be predefined. 

Any method in this chapter may use Monte Carlo methods for evaluating necessary
integrals. We will see below that Monte Carlo methods may also be applied to
solving complex integral equations like Equation 16.1.
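
To make the idea of Monte Carlo quadrature concrete, here is a minimal Python sketch that
estimates an integral of the form $\int_a^b k(t,u)\,x(u)\,du$ from randomly generated points.
It uses plain uniform sampling rather than importance sampling, and the integrand
(with $k(t,u) = tu$, $x(u) = u$, and $t = 0.5$), the sample count, and the function name are
all assumptions chosen for illustration.

```python
import random

def monte_carlo_integral(f, a, b, samples, rng):
    """Estimate the integral of f over [a, b] by averaging f at uniform random points."""
    width = b - a
    total = sum(f(a + width * rng.random()) for _ in range(samples))
    return width * total / samples

rng = random.Random(1)  # fixed seed so the sketch is repeatable

# Integrand k(t, u) * x(u) with k(t, u) = t*u, x(u) = u, and t = 0.5 (assumed values).
integrand = lambda u: 0.5 * u * u
estimate = monte_carlo_integral(integrand, 0.0, 1.0, 20000, rng)
exact = 1.0 / 6.0
```

The estimate approaches the exact value only statistically; the error shrinks roughly with the
square root of the number of samples, which is why knowledge of the integrand (for importance
sampling) is so valuable.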


16.8 Projection Methods 

In this section we will study a variety of methods for solving integral equations which 
all share a common feature: they project the universe of all possible solutions into 
some smaller set, so they are called projection methods . 

The general idea is that when we are faced with an integral equation, we will
decide, before attempting to find a solution, what kind of function we want to use
for a solution. If we think of all functions of a given class as elements of a function
space, then any function in that space may be represented by a set of coordinates
representing weights on a set of basis functions for that space. For example, we
may decide that we want to find a solution in the form of a polynomial; then it
would be described by a list of coefficients on the various powers of the independent
variable. If we want to find a solution that is a sum of sines, then we need a list
of numbers giving the magnitude and phase of each sine wave that is combined to
make up the function. The goal of each projection method is to find these numbers,
which identify one particular function from a fixed class. If we decide to look for
a polynomial solution of order four or less to solve a 1D integral equation, and the
real solution is $\sin(x)$, then we will be stuck with a poor approximation; we could
re-solve with a trigonometric basis to find a better solution.

In addition to selecting a space of functions, we need to select a set of bases for 
that space. Different choices of bases can affect the efficiency and accuracy of our 
algorithms. 

Many function spaces are infinite; a polynomial $p(x)$, for example, may be
described as an infinite sum of powers of $x$. To be representable on the computer, we
need to stop somewhere. That means that our solution functions will always come
from a finite-dimensional space. Finite-dimensional spaces have some nice
properties, but they have the severe limitation that sometimes the solution we seek lies
outside our space (as in the fourth-order polynomial trying to match $\sin(x)$ above).
The result of this limitation is that our solutions will almost always be
approximations to the real solution. The essential question then becomes one of how to find
the “best” approximation; this in turn leads us to ask how to measure “best.”

There are two parts to this measurement: what is measured and where the
measurement is taken. In the integral equation $x = g + \lambda\mathcal{K}x$, it is not $x$ itself
that we typically measure, though that is what we are solving for. Rather, since
$x - g - \lambda\mathcal{K}x = 0$, we look for an approximate $x_n$ that minimizes this difference,
which is itself a function. We can measure the magnitude of this function with any of
a variety of norms. But as mentioned earlier, we must decide where to measure the
error. Things get tricky because although our function $x_n$ is limited to a particular
subspace (say the space of polynomials), the operated function $\mathcal{K}x_n$ in general will
not be in that space.

There are several reasonable places to compute the error. Perhaps the most obvious
is in the infinite-dimensional space of all functions. Another is in the subspace
from which our solution is drawn; after all, since we can’t leave this space when
looking for an answer, we might as well optimize there. Another approach is to look
in the space of transformed functions $\mathcal{K}x$. We will see all of these choices below.

The essential points to keep in mind when thinking about the following projection 
methods are as follows: 

■ The solution space is of finite dimension. 

■ The solution space is chosen in advance. 

■ The basis for the solution space is chosen in advance. 

■ The means for measuring the size of the error must be specified. 

■ The space in which the error is measured must be specified.

We will see that some solution methods (such as Tchebyshev) actually search the
solution space looking for the best function, while most of the others are able to find
the best function in a single step (usually a matrix inversion).


16.8.1 Projection

The methods called projection techniques are based on the idea of finding that 
function x n from a finite-dimensional space that “best” approximates the ideal 
solution x . Each technique has its own definitions for the function space that is 
considered, and for the meaning of “best.” 

Recall from Section 16.7.3 that a function $x_n$ living in a finite-dimensional function
space may be written as a linear combination of the basis functions $\{h_i\}$ that
span that space:

$$
x_n = \sum_{i=1}^{n} \alpha_i h_i \tag{16.81}
$$

Each of the methods in this section will make use of a projection operator. If
$X$ is a normed space and $X_n \subset X$ is a nontrivial subspace of dimension $n$, then a
bounded operator $\mathcal{P}_n : X \to X_n$ with the property $\mathcal{P}_n x = x$ for all $x \in X_n$ is a
projection operator from $X$ to $X_n$ [254]. In other words, a projection operator takes
a vector in a space $X$ and turns it into a related vector in a smaller space $X_n$; if $x$
is already in $X_n$, nothing happens. A useful example is the orthographic projection
operator, which simply removes all components of the vector $x \in X$ that are not in
the space $X_n$. If $X_n$ is spanned by a set of bases $\{h_i\}$, then the projection operator
$\mathcal{P}_n$ finds a linear combination of just these $n$ bases to represent its input vector $x$:

$$
\mathcal{P}_n x = \sum_{i=1}^{n} \alpha_i h_i \tag{16.82}
$$

In particular, when the elements are functions, the result is a coefficient vector $\alpha$ that
describes the projected function.

For example, suppose we have a space $X_2$ of linear functions that is spanned by
the two monomials $h_1(t) = 1$ and $h_2(t) = t$. An input function $x(t) = 2t^2 + 3t + 4$
would be projected into $X_2$ by dropping the quadratic term, giving $\mathcal{P}_n x(t) = 3t + 4$,
which is a function described by the vector $\alpha = (4, 3)$; that is, $\alpha_1 = 4$ weights $h_1$
and $\alpha_2 = 3$ weights $h_2$. Since we now have a linear function,
repeating the projection will have no effect. Since the result of the projection operator
typically has less information than its input, projection is in general not invertible.
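
The two properties just described, that projecting twice changes nothing and that projection
discards information, can be checked with a tiny Python sketch. The representation used here
is an assumption made for the example: a polynomial is stored as a list of coefficients
ordered from the constant term upward, and `project_to_linear` is a hypothetical helper that
models the projection by simple truncation.

```python
def project_to_linear(coeffs):
    """Keep only the constant and linear coefficients of a polynomial.

    coeffs[i] multiplies t**i, so [4.0, 3.0, 2.0] stands for 2t^2 + 3t + 4.
    """
    return list(coeffs[:2])

x = [4.0, 3.0, 2.0]                 # the example function 2t^2 + 3t + 4
once = project_to_linear(x)         # 3t + 4
twice = project_to_linear(once)     # projecting again has no further effect

# Two different inputs can share one projection, so the operator cannot be inverted.
other = project_to_linear([4.0, 3.0, -7.0])
```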


16.8.2 Pictures of the Function Space

Since we’re working in a Hilbert space, it’s natural to think of the various operations
on functions in this space as constructions in the familiar Euclidean linear space of
vectors. Inspired by Arvo [14], in this subsection we will make a pictorial analog to
the operations discussed above, which will also help us with some of the methods
to come. In this section we will use the words vector and function interchangeably.
Remember that a point in a function space may be interpreted as a function or a
traditional Euclidean vector.

FIGURE 16.3
The vectors $x$ and $\mathcal{L}x$, and the planes $\pi_n$, $\pi_{\mathcal{L}}$, and $\pi_m$.

We begin by imagining a 3D space $X$ that contains all possible functions in our
system. For example, this could be the space of quadratic polynomials, where a
point $(a, b, c)$ corresponds to the function $at^2 + bt + c$. We will typically look for
functions $x$ that solve an integral equation within a subspace of $X$. In this case, the
subspace will be a plane, which we call $\pi_n$, as shown in Figure 16.3. The operator $\mathcal{L}$
is represented in these pictures by a rotation about an axis $R$; the transformed vector
$\mathcal{L}x$ is also shown in the figure. In general, if we find $\mathcal{L}x$ for every $x \in \pi_n$, we will
sweep out a new plane which we call $\pi_{\mathcal{L}}$. The vectors $x$ and $\mathcal{L}x$ will in general not
be colinear, and thus will describe a plane of their own; we call this $\pi_m$.

We will make use of two projection operators, $\mathcal{P}_n$ and $\mathcal{P}_{\mathcal{L}}$, which project a vector
orthographically onto the planes $\pi_n$ and $\pi_{\mathcal{L}}$, respectively.

An important property of the plane $\pi_m$, illustrated in Figure 16.4, is that it
contains the vector $x - \mathcal{L}x$. This is always true, even when $x$ and $\mathcal{L}x$ are colinear.

To see how the operators $\mathcal{L}$ and $\mathcal{P}_n$ interact, we begin with the subspace $\pi_n$ and a
rotation axis $R$ for the transformation $\mathcal{L}$, as shown in Figure 16.5. We will illustrate
$\mathcal{L}$ on a vector by rotating that vector around $R$ by some fixed amount.

A number of transformed vectors using this axis are shown in Figure 16.6. Note
that all the vectors in this diagram lie in the plane $\pi_m$. We begin with the vector
$x \in X$, and observe that its projection $\mathcal{P}_n x$ is found by erecting a perpendicular to
$\pi_n$ through $x$. The transformation of $x$ given by $\mathcal{L}x$ is found by rotating $x$ around
$R$, as shown.

FIGURE 16.4
The parallelogram of $x$, $\mathcal{L}x$, and $x - \mathcal{L}x$.

FIGURE 16.5
The plane $\pi_n$ and rotation axis $R$.

FIGURE 16.6
The effect of different operations on $x$.

Now we can look at multiple operations on $x$. If we apply $\mathcal{L}$ twice in a row,
we get $\mathcal{L}\mathcal{L}x = \mathcal{L}^2 x$, which is simply the rotation of $\mathcal{L}x$ around the axis $R$. The
application of $\mathcal{P}_n$ twice gives us $\mathcal{P}_n\mathcal{P}_n x = \mathcal{P}_n x$, since once $x$ has been projected into
$\pi_n$, further projection operations have no effect.

The more interesting results come from mixed applications. If we transform and
then project, the result is $\mathcal{P}_n\mathcal{L}x$. If we project and then transform, we get $\mathcal{L}\mathcal{P}_n x$. As
shown in the figure, these results are in general not the same; $\mathcal{P}_n\mathcal{L}x$ by definition
lies in $\pi_n$, but $\mathcal{L}\mathcal{P}_n x$ rarely will. In other words, the projection and transformation
operators $\mathcal{P}_n$ and $\mathcal{L}$ do not commute: $\mathcal{L}\mathcal{P}_n \ne \mathcal{P}_n\mathcal{L}$.
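
The failure to commute is easy to see numerically in the 3D picture. The Python sketch below
is only an analogy under stated assumptions: the plane is taken to be $z = 0$, the
transformation is a rotation of 90 degrees about the $x$ axis (an axis lying in the plane), and
the function names are hypothetical stand-ins for the two operators.

```python
import math

def rotate_about_x(v, angle):
    """Stand-in for the transformation: rotate v about the x axis."""
    x, y, z = v
    c, s = math.cos(angle), math.sin(angle)
    return (x, c * y - s * z, s * y + c * z)

def project_to_xy(v):
    """Stand-in for the projection: drop the component perpendicular to z = 0."""
    return (v[0], v[1], 0.0)

v = (1.0, 2.0, 3.0)
angle = math.pi / 2

project_then_transform = rotate_about_x(project_to_xy(v), angle)  # transform of the projection
transform_then_project = project_to_xy(rotate_about_x(v, angle))  # projection of the transform
projected_twice = project_to_xy(project_to_xy(v))                 # same as projecting once
```

Projecting last always leaves the result in the plane, while transforming last generally
carries it back out, which is the asymmetry described above.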

In general, when we look for functions to satisfy an integral equation, we will
want to search only within a particular solution space. We can either restrict our
search to those functions explicitly, or we can do it implicitly. The former is typically
accomplished by writing the unknown function $x$ as a sum of basis functions in
the space. The latter approach says that we can search through all $x \in X$, but we
immediately apply the projection operator $\mathcal{P}_n$ so that only the part of $x$ in the space
$\pi_n$ is used in the computation. Applying the transformation $\mathcal{L}$ to this projection
gives us a new, transformed function, but in general this function will not be in
$\pi_n$ anymore, so we have to reproject it to get a result back in $\pi_n$. The result of
this complete operation is that we end up looking through functions $\mathcal{P}_n\mathcal{L}\mathcal{P}_n x$, that
is, projections of transformed functions drawn from the space $\pi_n$, themselves the
projections of functions $x \in X$.

FIGURE 16.7
The transformation $\mathcal{L}$ rotates about a vector $R \perp \pi_n$, and $g \in \pi_n$.

Recall that we are concerned with solutions to the integral equation $x - \lambda\mathcal{K}x = g$.
For convenience we will set $\lambda = 1$ in the following discussion, so we are looking
for $x$ that satisfy $x - \mathcal{K}x = g$. To measure the accuracy of a candidate solution for
$x$, we can compute the magnitude of the error vector $d = x - \mathcal{K}x - g$. The smaller
this vector, presumably the smaller the error. Of course, the norm that we use for
computing the magnitude of $d$, and the space in which we compute it, will affect our
results; we will return to these ideas below.

Suppose we represent the transformation $\mathcal{L}$ as a rotation around an axis $R$ that is
normal to the subspace $\pi_n$ from which we choose our solution functions, as shown
in Figure 16.7. In other words, $\pi_n = \pi_m$. If the driving function $g$ is also within the
plane $\pi_n$, then we can in general find an $x$ such that $x - \mathcal{L}x = g$. This is a perfect
solution, with an error vector size $\|d\| = 0$ with any norm.

Now suppose that $g$ is not in the plane $\pi_n$. Then, since $x - \mathcal{L}x$ is stuck in this
plane, we can never match $g$ exactly, as shown in Figure 16.8. The best we can do
is try to find a choice for $x$ that minimizes the size of the error $d$.

We can match a $g \notin \pi_n$ if the transformation $\mathcal{L}$ gives us more freedom. Figure 16.9
shows a different rotation axis $R'$ that is in the plane $\pi_n$; now the plane $\pi_m$ spanned
by $x$ and $\mathcal{L}x$ is no longer constrained to $\pi_n$, and so the difference $x - \mathcal{L}x$ can move
through the 3D space of functions, eventually matching $x - \mathcal{L}x = g$, again driving
$\|d\| = 0$.

FIGURE 16.8
The transformation $\mathcal{L}$ rotates about a vector $R \perp \pi_n$, and $g \notin \pi_n$.

FIGURE 16.9
The transformation $\mathcal{L}$ rotates about a vector $R \in \pi_n$. Now we can match $g \notin \pi_n$.

This sort of generalization is not usually an option. We are given the subspace $\pi_n$
in which to search, a transformation $\mathcal{L}$, and a driving function $g$. There is often no
$x - \mathcal{L}x$ that can ever match $g$ for this set of givens. We’re then back in the situation
of Figure 16.8, where we simply do the best we can. A general construction for
$d$ is shown in Figure 16.10, where we have found the error associated with some
potential solution function $x$.

FIGURE 16.10
Constructing $d = x - \mathcal{L}x - g$. (The figure shows $-d$ for clarity.)

We mentioned above that although we want $d$ to be “small,” the definition of
“small” depends on the norm we apply to $d$ to measure its length, and the space in
which we apply that norm. Figure 16.11 shows three reasonable spaces in which to
apply any particular norm.

Perhaps the most straightforward place to compute $\|d\|$ is in the space $X$ of all
functions. But we could argue that this is misleading; we aren’t after all allowed
access to all functions. Perhaps a better place to measure the error is in the plane $\pi_{\mathcal{L}}$,
by computing its projection $\mathcal{P}_{\mathcal{L}}d$. Carrying this line of reasoning further, we argue
that we can only search for functions that are in the space $\pi_n$, so perhaps we should
measure the norm of $\mathcal{P}_n d$, directing us to the function $\mathcal{P}_n x \in \pi_n$ that minimizes the
error in the space of possible functions. Each of these choices is reasonable, and
gives rise to its own well-known algorithm discussed below.


16.8.3 Polynomial Collocation 

The method of collocation is a technique for finding $x_n$, given values for $g(t)$ and
$x_n(t)$ at a number of points $t_i$. Actually, we find the values of $\alpha$ describing $x_n$
such that $(\mathcal{I} - \lambda\mathcal{K})x_n$ matches the constraints; we will return to this distinction in a
moment.

FIGURE 16.11
The error $d$ in $X$, $\pi_n$, and $\pi_{\mathcal{L}}$.

The essential observation behind collocation comes from the form of the quadrature
rule applied to an expanded function. Recall Equations 16.66 and 16.67, which
defined this result, repeated here for reference:

$$
\begin{aligned}
(\mathcal{L}x_n)(t) &= \sum_{i=1}^{n} \alpha_i\,p_i(t) - \lambda(\mathcal{E}_Q x_n)(t) \\
p_i(t) &= h_i(t) - \lambda\sum_{m=1}^{q} w_m\,k(t,u_m)\,h_i(u_m)
\end{aligned}
\tag{16.83}
$$

As mentioned before, we will suppose that we know the value of $g(t) = (\mathcal{L}x_n)(t)$ at
the $p$ points $t_k$. We will also be brave and from here on simply ignore the error term
$(\mathcal{E}_Q x_n)(t)$. Since we’re going to find the best solution we can, we know that its error
will be the least in that class of functions.

To solve for the values of $\alpha$, we start by rewriting the first line of Equation 16.83
to represent the $p$ different known values of $x$ at the points $t_k$:

$$
g_k = g(t_k) = \sum_{i=1}^{n} \alpha_i\,p_{ik} \tag{16.84}
$$

This is just $p$ linear equations (one for each value of $t_k$) in the $n$ unknowns (the
values $\alpha_i$); each factor $p_{ik}$ is the transformed basis function $p_i$ evaluated at $t = t_k$:

$$
p_{ik} = p_i(t_k) = h_i(t_k) - \lambda\sum_{m=1}^{q} w_m\,k(t_k,u_m)\,h_i(u_m) \tag{16.85}
$$

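The $p$ equations may be expressed in matrix form, $\mathbf{g} = \mathbf{P}\alpha$. In the finite element
literature, the matrix $\mathbf{P}$ is called the mass matrix or stiffness matrix [203]. In tableau:

$$
\begin{bmatrix} g_1 \\ g_2 \\ \vdots \\ g_p \end{bmatrix}
=
\begin{bmatrix}
p_{11} & p_{21} & \cdots & p_{n1} \\
p_{12} & p_{22} & \cdots & p_{n2} \\
\vdots & \vdots & \ddots & \vdots \\
p_{1p} & p_{2p} & \cdots & p_{np}
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
\tag{16.86}
$$

The values of $\alpha$ are then simply $\alpha = \mathbf{P}^{-1}\mathbf{g}$. If $p = n$, and the matrix $\mathbf{P}$ is square and
nonsingular, then we have a unique solution for $\alpha$, and hence $x_n$.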
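
The collocation procedure can be sketched end to end in Python: build the matrix of
Equation 16.85, then solve the square system for the coefficients. This is a sketch under
stated assumptions rather than a definitive implementation. The test problem (kernel
$k(t,u) = tu$ on $[0,1]$, $\lambda = 1$, $g(t) = 2t/3$, whose exact solution is $x(t) = t$), the
monomial basis, the collocation points, the two-point Gauss-Legendre rule, and the helper
names are all hypothetical choices. The system is solved by elimination rather than by
forming an explicit inverse.

```python
import math

def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def collocate(basis, g, kernel, lam, quad_points, quad_weights, colloc_points):
    """Build the matrix of p_ik values (Equation 16.85) and solve for the coefficients."""
    def p(h, t):
        return h(t) - lam * sum(w * kernel(t, u) * h(u)
                                for w, u in zip(quad_weights, quad_points))
    matrix = [[p(h, t) for h in basis] for t in colloc_points]  # row k holds p_ik for all i
    return solve_linear(matrix, [g(t) for t in colloc_points])

a = 0.5 / math.sqrt(3.0)
alpha = collocate(basis=[lambda t: 1.0, lambda t: t],   # h_1 = 1, h_2 = t
                  g=lambda t: 2.0 * t / 3.0,
                  kernel=lambda t, u: t * u,
                  lam=1.0,
                  quad_points=[0.5 - a, 0.5 + a],
                  quad_weights=[0.5, 0.5],
                  colloc_points=[0.25, 0.75])
```

Because the exact solution happens to lie in the span of the chosen basis, the recovered
coefficients describe it exactly (up to rounding); in general they describe only the best
match at the collocation points.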

This construction is a great example of the power of linearity. The coefficients $\alpha$
that describe $x_n$ may be applied either to the basis functions $\{h_i\}$ in the domain $\pi_n$,
or to the transformed basis functions $\{p_i\} = \{\mathcal{L}h_i\}$ in the transformed domain $\pi_{\mathcal{L}}$.
The matrix simply takes us from one space to the other. When we have the right $\alpha$
describing $x_n$, then we satisfy $\mathcal{L}x_n = g$; that is, we simply transform $x_n$ from the
space $\pi_n$ to the space $\mathcal{L}\pi_n = \pi_{\mathcal{L}}$. The same alphas are applied to both spaces.

The most important feature of this algorithm is that it tells us only about the
transformed, approximate $(\mathcal{I} - \lambda\mathcal{K})x_n$, and nothing directly of the approximate
function $x_n$. In general, at any point $t$ (including the points $t = t_i$ where we solved
for the function), the approximate solution will not match the ideal: $x_n(t) \ne x(t)$.
However, as the number of collocation points $t_i$ increases, the approximate solutions
$x_n$ will converge to the ideal solution $x$ [22]. As with all other techniques of this
type, the choice of the quadrature rule $\mathcal{Q}$ used to evaluate the matrix coefficients can
greatly influence the speed and accuracy of the results.


Collocation 

The method described above is actually polynomial collocation, a special case of a
more general method known simply as collocation [162]. We will summarize collocation
here because it gives us the flexibility to use functions other than polynomials.

Suppose that we have an $n$-dimensional space $X_n$ of functions, spanned by a basis
set $\{h_k\}$, and a projection operator $\mathcal{P}_n$ from some larger function space $X$ onto $X_n$.
Then the projection $(\mathcal{P}_n x)(t)$ of any function $x(t)$ may be expanded out with respect
to these functions:

$$
(\mathcal{P}_n x) = \sum_{k=1}^{n} \langle x | h_k \rangle\,h_k \tag{16.87}
$$

Similarly, the projected version of $x$ operated upon by $\mathcal{K}$ is given by

$$
\begin{aligned}
(\mathcal{P}_n\mathcal{K}x)(t) &= \sum_{k=1}^{n} \langle (\mathcal{K}x)(t) | h_k(t) \rangle\,h_k(t) \\
&= \sum_{k=1}^{n} \left\langle \int_a^b k(t,u)\,x(u)\,du \,\middle|\, h_k(t) \right\rangle h_k(t)
\end{aligned}
\tag{16.88}
$$


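Recalling our basic condition $x - \lambda\mathcal{K}x = g$, we first project our basic equation down
onto the subspace $X_n$:

$$
\begin{aligned}
\mathcal{P}_n g &= \mathcal{P}_n[(\mathcal{I} - \lambda\mathcal{K})x] \\
&= \mathcal{P}_n x - \lambda\mathcal{P}_n\mathcal{K}x \\
&= x_n - \lambda\mathcal{P}_n\mathcal{K}x_n
\end{aligned}
\tag{16.89}
$$

where we have restricted our search of functions $x_n$ to those lying in $X_n$; thus,
$\mathcal{P}_n x_n = x_n$ because $x_n$ is already projected. Now if we know the value of this
equation at $n$ known points $t_j$, we have $n$ equations of the form

$$
\begin{aligned}
x_n(t_j) - \lambda(\mathcal{P}_n\mathcal{K}x_n)(t_j) &= (\mathcal{P}_n g)(t_j) \\
x_n(t_j) - \lambda\sum_{i=1}^{n} \left\langle \int_a^b k(t_j,u)\,x_n(u)\,du \,\middle|\, h_i \right\rangle h_i
&= \sum_{i=1}^{n} \langle g(t_j) | h_i \rangle\,h_i
\end{aligned}
\tag{16.90}
$$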


This is the general formulation of collocation. 

We will now backtrack, find our solution in the space of polynomials, and show
that it matches our previous result. We will select as our basis functions $\{h_i\}$
the Lagrange polynomials $L_i(t)$, which are a popular set of bases for polynomial
interpolation [348]. These functions are given by

$$
L_i(t) = \prod_{j=1,\ j \ne i}^{n} \frac{t - t_j}{t_i - t_j} \tag{16.91}
$$


The Lagrange polynomials for $i = 4$ are shown in Figure 16.12. Note that each $L_i$
is zero at all $t_j$ except $t_i$, where it has a value of 1. In symbols,

$$
L_i(t_k) = \delta(i - k) \tag{16.92}
$$

FIGURE 16.12
The Lagrange polynomials for $i = 4$.
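
The defining property in Equation 16.92 is easy to confirm numerically. The following Python
sketch builds the Lagrange polynomials of Equation 16.91 for a small set of nodes and tabulates
them at those nodes; the node values and the helper name `lagrange_basis` are assumptions
chosen for illustration, and indices are 0-based as is usual in Python.

```python
def lagrange_basis(i, nodes):
    """Return the function L_i(t) for the given distinct nodes (0-based index i)."""
    def basis(t):
        value = 1.0
        for j, tj in enumerate(nodes):
            if j != i:
                value *= (t - tj) / (nodes[i] - tj)
        return value
    return basis

nodes = [0.0, 1.0, 2.0, 3.0]   # four assumed, equally spaced nodes

# table[i][k] holds L_i(t_k); by Equation 16.92 this should be the identity matrix.
table = [[lagrange_basis(i, nodes)(tk) for tk in nodes]
         for i in range(len(nodes))]
```

Because each polynomial is 1 at its own node and 0 at the others, the coefficient of $L_i$ in an
expansion is simply the function's value at the corresponding node, which is what makes this
basis convenient for collocation.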


To write Equation 16.90 in terms of the Lagrange polynomials, we start by finding
the forms of the two relevant projections. Projected functions $\mathcal{P}_n x$ may be expressed

$$
\begin{aligned}
(\mathcal{P}_n x)(t) &= \sum_{i=1}^{n} \left( \sum_{j=1}^{n} x(u_j)\,L_i(u_j) \right) L_i(t) \\
&= \sum_{i=1}^{n} x(u_i)\,L_i(t)
\end{aligned}
\tag{16.93}
$$

Using this result, the projected transformed functions $\mathcal{P}_n\mathcal{K}x$ may be similarly
simplified:

$$
(\mathcal{P}_n\mathcal{K}x)(t) = \sum_{i=1}^{n} \left( \int_a^b k(t,u)\,x(u)\,du \right) L_i(t) \tag{16.94}
$$


so Equation 16.90 may now be written

$$
x_n(t_j) - \lambda\sum_{i=1}^{n} \left( \int_a^b k(t_j,u)\,x_n(u)\,du \right) L_i(t_j)
= \sum_{i=1}^{n} g(u_i)\,L_i(t_j) \tag{16.95}
$$

Again noting that $L_i(t_k) = \delta(i - k)$, we can simplify this to the $n$ equations

$$
x_n(t_j) - \lambda\int_a^b k(t_j,u)\,x_n(u)\,du = g(u_j) \tag{16.96}
$$

This is the set of polynomial collocation equations, and they match our previous
result of Equation 16.85.

The advantage of using the general collocation equations is that we need not
be restricted to the polynomials $L_i(t)$ for a basis. For example, spline bases are
discussed in Atkinson [22] and Baker [26].


16.8.4 Tchebyshev Approximation

When we discussed the residual error function of Equation 16.19, we said that the
choice of norm strongly influenced the approximate function found. In this and the
following sections we will examine algorithms based on a number of different norms
for measuring the approximation error $x - \lambda\mathcal{K}x - g$.

Perhaps the most straightforward norm is the Tchebyshev (or Chebyshev) norm,
which is also called the $L_\infty$ norm. For a function $x$ over an interval $[a, b]$, this is
equivalent to

$$
\|x\|_\infty = \max_{a \le t \le b} |x(t)| \tag{16.97}
$$

To apply this norm to the residual function, we first write the residual in terms of
the functions $g$ and $x_n$. Recalling Equations 16.19 and 16.20, we write

$$
r_n = g - (\mathcal{I} - \lambda\mathcal{K})x_n \tag{16.98}
$$

So the Tchebyshev norm on this error is given by

$$
\|r_n\| = \max_{a \le t \le b} |g - (\mathcal{I} - \lambda\mathcal{K})x_n| \tag{16.99}
$$

The function $x_n$, which minimizes the norm of the residual, is the one we want.
Recalling that $x_n$ is given by a vector of coefficients $\alpha_i$, the residual $r_{n,0}$ corresponding
to the “optimal” $\alpha_0$ is then

$$
\|r_{n,0}\| = \min_{\alpha} \max_{a \le t \le b} |g - (\mathcal{I} - \lambda\mathcal{K})x_n| \tag{16.100}
$$




We now do the standard procedure of approximating the integral operator $\mathcal{K}$ with a
quadrature operator $\mathcal{Q} + \mathcal{E}_Q$, and then dropping the error:

$$
\begin{aligned}
\|r_{n,0}\| &= \min_{\alpha} \max_{a \le t \le b} |g(t) - [(\mathcal{I} - \lambda(\mathcal{Q} + \mathcal{E}_Q))x_n](t)| \\
&\approx \min_{\alpha} \max_{a \le t \le b} |g(t) - [(\mathcal{I} - \lambda\mathcal{Q})x_n](t)|
\end{aligned}
\tag{16.101}
$$

Explicitly writing the expansion of $x_n$ over the $n$ bases $h_i$ we have

$$
\|r_{n,0}\| = \min_{\alpha} \max_{a \le t \le b} \left| g(t) - \sum_{i=1}^{n} \alpha_i\,p_i(t) \right| \tag{16.102}
$$

This type of problem is called a minimax problem, since we’re trying to minimize
a maximum. Typically we evaluate the system at a variety of values for $\alpha$, and then,
on the basis of those measurements, the search is directed toward what is presumed
to be the global minimum. Depending on the shape of the function space, the
particular searching strategy chosen can influence to a large or small degree the final
function chosen [121]. As always, the choice of quadrature rule $\mathcal{Q}$ will influence the
final selection as well.
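
To show what "evaluating the system at a variety of values" can look like, here is a
deliberately crude Python sketch that replaces a directed search with an exhaustive grid
search over two coefficients. It is an illustration only, not the strategy of any particular
reference: the transformed basis functions $p_1(t) = 1 - t/2$ and $p_2(t) = 2t/3$, the driving
function (which includes a small quadratic term so that no exact match exists), the sample
points, the coefficient grid, and the function name are all assumptions.

```python
def minimax_grid_search(p_funcs, g, sample_ts, grid):
    """Try every coefficient pair on the grid; keep the smallest worst-case residual."""
    best_alpha, best_err = None, float("inf")
    for a1 in grid:
        for a2 in grid:
            err = max(abs(g(t) - (a1 * p_funcs[0](t) + a2 * p_funcs[1](t)))
                      for t in sample_ts)
            if err < best_err:
                best_alpha, best_err = (a1, a2), err
    return best_alpha, best_err

p1 = lambda t: 1.0 - 0.5 * t                          # assumed transformed basis function
p2 = lambda t: 2.0 * t / 3.0                          # assumed transformed basis function
g = lambda t: 2.0 * t / 3.0 + 0.1 * t * (1.0 - t)     # assumed g with an unmatchable part

sample_ts = [i / 20.0 for i in range(21)]             # where the maximum is sampled
grid = [i / 4.0 for i in range(-8, 9)]                # candidate coefficient values
alpha, err = minimax_grid_search([p1, p2], g, sample_ts, grid)
```

The residual found this way can never reach zero, because the quadratic part of the assumed
$g$ lies outside the span of the two basis functions; a finer grid or a directed search would
only move it closer to the best achievable worst-case error.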


16.8.5 Least Squares

In the next few methods, we can draw a picture for the goals expressed by minimizing 
a particular norm. 

Recall that the $L_2$ norm for a continuous function $x(t)$ is defined as

$$
\|x\|_2 = \left[ \int_a^b |x(t)|^2\,dt \right]^{1/2} \tag{16.103}
$$

We use this norm in the least-squares method to find an approximate solution $x_n$ that
minimizes the $L_2$ norm of the residual. We will use this norm to measure any vector
by projecting that vector into the space $\pi_{\mathcal{L}}$, the space of all transformed functions
$\mathcal{L}x$, as shown in Figure 16.13.

The function $x$ we seek is the one that has the smallest projected error vector
$\mathcal{P}_{\mathcal{L}}d$. The norm $\|\mathcal{P}_{\mathcal{L}}d\|$ goes to zero when $d \perp \pi_{\mathcal{L}}$; that is, when the vector $d$ is
perpendicular to the space of transformed functions $\mathcal{L}x$. As shown in the figure, this
can also be interpreted by saying that $d$ is parallel to the normal of plane $\pi_{\mathcal{L}}$.

It may be helpful to see quickly how this works in two dimensions before
proceeding to the general case. In 2D, the plane $\pi_{\mathcal{L}}$ is spanned by the two basis functions
$p_1$ and $p_2$. We get the ball rolling with two simple observations. First, any vector
$v = x - \mathcal{L}x$ in $\pi_{\mathcal{L}}$ may be described by a linear combination of these bases. Second,
the vector $d = g - v$ is perpendicular to the plane $\pi_{\mathcal{L}}$. We can write these three
conditions symbolically:

$$
\begin{aligned}
\langle d | p_1 \rangle &= 0 \\
\langle d | p_2 \rangle &= 0 \\
v &= \alpha_1 p_1 + \alpha_2 p_2
\end{aligned}
\tag{16.104}
$$


Substituting for $d$ in the first line, we find $\langle g - v | p_1 \rangle = 0$. Expanding this, we find

$$
\begin{aligned}
\langle g | p_1 \rangle &= \langle v | p_1 \rangle \\
&= \langle \alpha_1 p_1 + \alpha_2 p_2 | p_1 \rangle \\
&= \alpha_1 \langle p_1 | p_1 \rangle + \alpha_2 \langle p_2 | p_1 \rangle
\end{aligned}
\tag{16.105}
$$

FIGURE 16.13
The vector $d = x - \mathcal{L}x - g$ is projected into $\pi_{\mathcal{L}}$. The length of this projection is zero: $\|\mathcal{P}_{\mathcal{L}}d\| = 0$.

Similarly, we find for the second term,

$$
\langle g | p_2 \rangle = \alpha_1 \langle p_1 | p_2 \rangle + \alpha_2 \langle p_2 | p_2 \rangle \tag{16.106}
$$

We now have two equations in two unknowns, the coefficients $\alpha_1$ and $\alpha_2$. Solving
for these, we get the vector $v = x - \mathcal{L}x$ such that $d = g - v$ is perpendicular to $\pi_{\mathcal{L}}$.
Using the $L_2$ norm as a general measure of distance, we can leave behind the
2D imagery of Figure 16.13 and simply observe that the error vector from $g$ to the
transformed approximation solution $\mathcal{L}x_n$ must be perpendicular to the transformed
space, which is spanned by the $n$ transformed basis vectors $\{\mathcal{L}h_i\} = \{p_i\}$. We can
express this statement in symbols by asserting that the inner product of the error
vector with every transformed basis vector is zero:

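$$
\langle \mathcal{L}x_n - g | p_i \rangle = 0 \qquad i = 1, 2, \ldots, n \tag{16.107}
$$

There are $n$ of these equations, one for each basis function $h_i$. We can use the
linearity of the braket to transform this into a more useful form:

$$
\langle \mathcal{L}x_n | p_i \rangle = \langle g | p_i \rangle \qquad i = 1, 2, \ldots, n \tag{16.108}
$$

We can now replace $\mathcal{L}x_n$ with Equations 16.66 and 16.67, which express the
transformed function in terms of a quadrature rule applied to the function’s expanded
form:

$$
\begin{aligned}
\langle \mathcal{L}x_n | p_i \rangle &= \langle g | p_i \rangle \\
\left\langle \sum_{k=1}^{n} \alpha_k p_k \,\middle|\, p_i \right\rangle &= \langle g | p_i \rangle \\
\sum_{k=1}^{n} \alpha_k \langle p_k | p_i \rangle &= \langle g | p_i \rangle
\end{aligned}
\tag{16.109}
$$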


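So now we again have $n$ equations (one for each basis vector $h_i$) in the $n$ unknowns
of the vector $\alpha$. This matrix equation $\mathbf{P}_p\,\alpha = \mathbf{g}$ appears in tableau form as

$$
\begin{bmatrix}
\langle p_1|p_1\rangle & \langle p_2|p_1\rangle & \cdots & \langle p_n|p_1\rangle \\
\langle p_1|p_2\rangle & \langle p_2|p_2\rangle & \cdots & \langle p_n|p_2\rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle p_1|p_n\rangle & \langle p_2|p_n\rangle & \cdots & \langle p_n|p_n\rangle
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
=
\begin{bmatrix} \langle g|p_1\rangle \\ \langle g|p_2\rangle \\ \vdots \\ \langle g|p_n\rangle \end{bmatrix}
\tag{16.110}
$$

If the square matrix $\mathbf{P}_p$ is nonsingular, then it may be inverted to produce a unique
result for $\alpha$, which gives us $x_n$ from $x_n = \sum_{k=1}^{n} \alpha_k h_k$.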
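
The two-dimensional case of Equations 16.105 and 16.106 is small enough to sketch directly
in Python. The sketch below is hedged accordingly: inner products are approximated with a
two-point Gauss-Legendre rule on $[0,1]$, the transformed basis functions $p_1(t) = 1 - t/2$
and $p_2(t) = 2t/3$ and the driving function $g(t) = 2t/3$ are assumed test data, the helper
names are hypothetical, and the $2 \times 2$ system is solved with Cramer's rule simply to keep
the example short.

```python
import math

def inner(f, h, points, weights):
    """Approximate the inner product <f|h> (integral of f*h) with a quadrature rule."""
    return sum(w * f(t) * h(t) for t, w in zip(points, weights))

def least_squares_2d(p1, p2, g, points, weights):
    """Solve the two normal equations (16.105, 16.106) for alpha_1 and alpha_2."""
    a11, a12 = inner(p1, p1, points, weights), inner(p2, p1, points, weights)
    a21, a22 = inner(p1, p2, points, weights), inner(p2, p2, points, weights)
    b1, b2 = inner(g, p1, points, weights), inner(g, p2, points, weights)
    det = a11 * a22 - a12 * a21          # must be nonzero for a unique solution
    return ((b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det)

a = 0.5 / math.sqrt(3.0)
points, weights = [0.5 - a, 0.5 + a], [0.5, 0.5]

p1 = lambda t: 1.0 - 0.5 * t             # assumed transformed basis function
p2 = lambda t: 2.0 * t / 3.0             # assumed transformed basis function
g = lambda t: 2.0 * t / 3.0              # assumed driving function

alpha = least_squares_2d(p1, p2, g, points, weights)
```

Since the assumed $g$ coincides with $p_2$, the residual can be driven to zero and the
coefficients come out as approximately $(0, 1)$; for a $g$ outside the span, the same
equations return the coefficients whose residual is perpendicular to both $p_1$ and $p_2$
under the approximated inner product.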


16.8.6 Galerkin

The Galerkin method, also called the Ritz-Galerkin method and the method of
moments, is similar to the least-squares approach except that it computes the error
with respect to a different space.

Recall that the least-squares method found the approximate solution $x_n \in X_n$
such that the transformed function $\mathcal{L}x_n$ was as close as possible to $g$. In the Galerkin
method, we seek a transformed function $\mathcal{L}x_n$ that differs from $g$ only in ways that
cannot be represented in $X_n$. In other words, we imagine that every vector in the
system is projected into the subspace $X_n$, and we ask for the function $\mathcal{L}x_n$ that is
closest to the projection of $g$.

This situation is shown in Figure 16.14, inspired by a similar image in Arvo [14].
The function $x_n$ is chosen so that the error vector between $g$ and $\mathcal{L}x_n$ is perpendicular
to the space $X_n$.

As we did with least-squares, we will take a quick look at the situation in 2D. The
plane $\pi_n$ is spanned by the two basis functions $h_1$ and $h_2$, which may be combined to



834 


16 INTEGRAL EQUATIONS 



FIGURE 16.14

The error vector from $\mathcal{L}x_n$ to $g$ is perpendicular to the subspace $X_n$.


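form every vector $v = \mathcal{L}x$ in $\pi_n$. And our constraint is that the vector $d = g - v$
is perpendicular to this plane $\pi_n$. Symbolically,

$$\begin{aligned}
\langle d \mid h_1 \rangle &= 0 \\
\langle d \mid h_2 \rangle &= 0 \\
v &= \alpha_1 p_1 + \alpha_2 p_2
\end{aligned} \tag{16.111}$$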

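We can now go through the same procedure as for the least-squares case, substituting
for $d$, to find $\langle g - v \mid h_1 \rangle = 0$. Then, expanding and simplifying, we find

$$\begin{aligned}
\langle g \mid h_1 \rangle &= \langle v \mid h_1 \rangle \\
&= \langle \alpha_1 p_1 + \alpha_2 p_2 \mid h_1 \rangle \\
&= \alpha_1 \langle p_1 \mid h_1 \rangle + \alpha_2 \langle p_2 \mid h_1 \rangle
\end{aligned} \tag{16.112}$$

Similarly, we find for the second term,

$$\langle g \mid h_2 \rangle = \alpha_1 \langle p_1 \mid h_2 \rangle + \alpha_2 \langle p_2 \mid h_2 \rangle \tag{16.113}$$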


As before, we now have two equations in two unknowns, the coefficients $\alpha_1$ and $\alpha_2$.

To state this approach in general terms, we can use the projection operator $\mathcal{P}_n$,
which takes any vector $v \in X$ and projects it onto the subspace $X_n$. Then as we see
in the figure, the projections of $g$ and $\mathcal{L}x_n$ are the same:

$$\mathcal{P}_n \mathcal{L} x_n = \mathcal{P}_n g \tag{16.114}$$

Recall that $\mathcal{P}_n$ is not in general invertible; that is, the transformation from $X$ to $X_n$
loses information. In particular, it loses all of the components of the vector that are
not in $X_n$.

The goal then is to make the projected residual zero: 

$$\| r_n \| = 0 \tag{16.115}$$

This statement is equivalent to saying that the error vector is perpendicular to the
subspace. That is, the inner product of the error vector with each basis vector of $X_n$
is zero:

$$\langle \mathcal{L}x_n - g \mid h_i \rangle = 0, \qquad i = 1, 2, \ldots, n \tag{16.116}$$

Compare this to Equation 16.107; it only differs in that we are projecting the error
vector $\mathcal{L}x_n - g$ onto the basis function $h_i$ rather than the transformed function
$\mathcal{L}h_i$. Following the same procedure as for least squares, we can first open up the
expression into two smaller inner products:

$$\langle \mathcal{L}x_n \mid h_i \rangle = \langle g \mid h_i \rangle, \qquad i = 1, 2, \ldots, n \tag{16.117}$$

and then use Equations 16.66 and 16.67 once again:

$$\sum_{k=1}^{n} \alpha_k \langle p_k \mid h_i \rangle = \langle g \mid h_i \rangle, \qquad i = 1, 2, \ldots, n \tag{16.118}$$

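In matrix form, these equations may be expressed as $\alpha P_g = g$, or in tableau:

$$\begin{bmatrix}
\langle p_1 \mid h_1 \rangle & \langle p_2 \mid h_1 \rangle & \cdots & \langle p_n \mid h_1 \rangle \\
\langle p_1 \mid h_2 \rangle & \langle p_2 \mid h_2 \rangle & \cdots & \langle p_n \mid h_2 \rangle \\
\vdots & \vdots & \ddots & \vdots \\
\langle p_1 \mid h_n \rangle & \langle p_2 \mid h_n \rangle & \cdots & \langle p_n \mid h_n \rangle
\end{bmatrix}
\begin{bmatrix} \alpha_1 \\ \alpha_2 \\ \vdots \\ \alpha_n \end{bmatrix}
=
\begin{bmatrix} \langle g \mid h_1 \rangle \\ \langle g \mid h_2 \rangle \\ \vdots \\ \langle g \mid h_n \rangle \end{bmatrix}
\tag{16.119}$$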


As before, if the matrix $P_g$ is nonsingular, it may be inverted to derive a unique $\alpha$.
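
As a concrete check of the two tableaux, the following Python sketch builds the matrix whose element $(i, k)$ is $\langle p_k \mid \text{test}_i \rangle$, where the test function is $h_i$ for Galerkin and $p_i$ for least squares. Everything specific here is an assumption made for illustration and is not from the text: the domain $[0,1]$, the midpoint quadrature, the helper names, and the test problem $k(t,u) = tu$, $g(t) = t$, $\lambda = 1$, whose exact solution is $x(t) = \tfrac{3}{2}t$.

```python
def inner(f, g, n=200):
    """Midpoint-rule estimate of <f|g>, the integral of f(t)*g(t) over [0, 1]."""
    step = 1.0 / n
    return step * sum(f((j + 0.5) * step) * g((j + 0.5) * step) for j in range(n))


def transformed(h, kernel, lam, n=200):
    """Return p = L h = h - lam * K h as a callable (K h by midpoint quadrature)."""
    step = 1.0 / n
    nodes = [(j + 0.5) * step for j in range(n)]

    def p(t):
        return h(t) - lam * step * sum(kernel(t, u) * h(u) for u in nodes)

    return p


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def projection_coefficients(basis, kernel, g, lam, method="galerkin"):
    """Coefficients alpha of x_n = sum_k alpha_k h_k for x = g + lam * K x."""
    p = [transformed(h, kernel, lam) for h in basis]
    # Galerkin tests against h_i; least squares tests against p_i = L h_i.
    tests = basis if method == "galerkin" else p
    matrix = [[inner(pk, ti) for pk in p] for ti in tests]  # element (i, k)
    rhs = [inner(g, ti) for ti in tests]
    return solve_linear(matrix, rhs)


def kernel(t, u):
    return t * u  # hypothetical separable kernel


def driving(t):
    return t  # hypothetical driving function g


basis = [lambda t: 1.0, lambda t: t]  # h_1 = 1, h_2 = t
alpha_galerkin = projection_coefficients(basis, kernel, driving, 1.0)
alpha_least_squares = projection_coefficients(
    basis, kernel, driving, 1.0, method="least_squares")
```

Because the exact solution of this assumed problem lies in the span of the basis, both methods should recover coefficients close to $(0, 1.5)$, up to quadrature and rounding error. The only difference between the two code paths is the choice of test functions, which is exactly the difference between Equations 16.110 and 16.119.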

It’s not too hard to show that as the space $X_n$ picks up more dimensions, and
thus approaches $X$, the Galerkin method will converge to the correct answer. We
start with Equation 16.114 and find

$$\begin{aligned}
\mathcal{P}_n g &= \mathcal{P}_n \mathcal{L} x_n \\
&= \mathcal{P}_n (I - \lambda\mathcal{K}) x_n \\
&= \mathcal{P}_n x_n - \lambda \mathcal{P}_n \mathcal{K} x_n \\
&= (I - \lambda \mathcal{P}_n \mathcal{K}) x_n
\end{aligned} \tag{16.120}$$

or, upon solving for $x_n$,

$$x_n = (I - \lambda \mathcal{P}_n \mathcal{K})^{-1} \mathcal{P}_n g \tag{16.121}$$

where we have used the fact that since $x_n$ lies in $X_n$, its projection into that space is
an identity; that is, $\mathcal{P}_n x_n = x_n$.

Now we can find the error between this approximation and the ideal x by simply 
subtracting the two. By rearranging the difference, we can express it as a factor of a 
simple term that goes to zero: 

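$$\begin{aligned}
x - x_n &= (I - \lambda\mathcal{K})^{-1} g - (I - \lambda\mathcal{P}_n\mathcal{K})^{-1} \mathcal{P}_n g \\
&= \left[ (I - \lambda\mathcal{K})^{-1} - (I - \lambda\mathcal{P}_n\mathcal{K})^{-1} \mathcal{P}_n \right] g \\
&= \left\{ (I - \lambda\mathcal{P}_n\mathcal{K})^{-1} \left[ (I - \lambda\mathcal{P}_n\mathcal{K}) - \mathcal{P}_n (I - \lambda\mathcal{K}) \right] (I - \lambda\mathcal{K})^{-1} \right\} g
\end{aligned} \tag{16.122}$$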

By expanding the internal terms, we can see how they cancel each other and lead to 
a simpler expression: 

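$$\begin{aligned}
x - x_n &= \left\{ (I - \lambda\mathcal{P}_n\mathcal{K})^{-1} \left[ I - \lambda\mathcal{P}_n\mathcal{K} - \mathcal{P}_n + \lambda\mathcal{P}_n\mathcal{K} \right] (I - \lambda\mathcal{K})^{-1} \right\} g \\
&= \left\{ (I - \lambda\mathcal{P}_n\mathcal{K})^{-1} (I - \mathcal{P}_n) (I - \lambda\mathcal{K})^{-1} \right\} g
\end{aligned} \tag{16.123}$$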

Choosing our spaces so that $X_n \to X$ as $n \to \infty$, the projection operator $\mathcal{P}_n \to I$,
so $(I - \mathcal{P}_n) \to 0$, and the whole operator goes to zero, so $x_n \to x$ as $n \to \infty$ [343].
In other words, as our space $X_n$ enlarges and includes more functions closer to $X$,
the Galerkin method will find those functions and give us a better estimate of $x$.

The Galerkin method may be enhanced by a single additional step, resulting in
the iterated Galerkin method. The idea is to pass the estimate $x_g$ derived from the
Galerkin method above through one step of successive substitution:

$$x'_g = g + \lambda\mathcal{K}x_g \tag{16.124}$$

Note that the new $x'_g$ is not necessarily in the subspace $X_n$. Although this one step
can refine our approximation, it may be surprising to note that additional steps of 
iteration will not improve the quality of the approximation [343]. A detailed study 
of the Galerkin method for integrals of the type Ti may be found in Ikebe [223]. 
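
The effect of the extra substitution step can be seen on a deliberately tiny problem. The sketch below is only an illustration under assumed data (kernel $k(t,u) = tu$, $g(t) = t$, $\lambda = 1$ on $[0,1]$, exact solution $x(t) = \tfrac{3}{2}t$, and a single constant basis function $h_1 = 1$); the function names are hypothetical.

```python
def midpoint_integral(f, n=400):
    """Midpoint-rule estimate of the integral of f over [0, 1]."""
    step = 1.0 / n
    return step * sum(f((j + 0.5) * step) for j in range(n))


LAM = 1.0


def kernel(t, u):
    return t * u  # hypothetical separable kernel


def g(t):
    return t  # hypothetical driving function


def p1(t):
    """Transformed basis function p_1 = L h_1 for the constant basis h_1(t) = 1."""
    return 1.0 - LAM * midpoint_integral(lambda u: kernel(t, u))


# Galerkin with one basis function: alpha * <p_1|h_1> = <g|h_1>.
alpha = midpoint_integral(g) / midpoint_integral(p1)


def x_galerkin(t):
    return alpha  # x_g = alpha * h_1 is a constant function


def x_iterated(t):
    """One step of successive substitution, as in Equation 16.124."""
    return g(t) + LAM * midpoint_integral(lambda u: kernel(t, u) * x_galerkin(u))


def exact(t):
    return 1.5 * t


error_galerkin = abs(x_galerkin(1.0) - exact(1.0))
error_iterated = abs(x_iterated(1.0) - exact(1.0))
```

For this assumed problem the Galerkin estimate is the constant $2/3$, while the iterated estimate is $\tfrac{4}{3}t$, which is no longer in the one-dimensional subspace and is noticeably closer to the exact solution at $t = 1$.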

16.8.7 Wavelets

There are two major sources of computational expense in the techniques presented 
in the last few subsections. One is the cost of building the matrix, and the other is 
the cost of inverting the matrix and then evaluating the resulting matrix products. 

Recall that in Chapter 6 we saw that a signal projected onto a wavelet basis often 
had many near-zero coefficients on the wavelet basis functions. If we ignore these 
near-zero coefficients and treat them as zero, then we can ignore the multiplies and 
additions that we would otherwise perform. There is some error associated with this
operation, but for the moment let’s assume that the error is tolerable. If many
of the matrix entries are nearly zero, then the result is a nearly accurate solution at a 
great savings in time. In fact, the savings are even better, because there are efficient 
algorithms for computing the coefficients and inverting the resulting matrix. 

In other words, rather than solve $x = g + \mathcal{K}x$, we instead solve $W(x) = W(g + \mathcal{K}x)$,
or $Wx = Wg + (W\mathcal{K})(Wx)$, where $W$ is the wavelet transformation. When the kernel
$k$ is represented as a matrix, the projected matrix $W\mathcal{K}$ is often sparse, or mostly zero,
with only a few nonzero entries. The related operator matrix $W\mathcal{L} = W(I - \lambda\mathcal{K})$ is
also sparse.

Several methods for computing the wavelet transform to solve Fredholm integrals 
of the second kind have appeared recently [7,8,42]. The basic idea is to discretize the 
kernel k(t, u ) into a finite matrix, and then take the wavelet transform of that matrix. 
Where the kernel is smooth, most of the wavelet coefficients will be near zero, since 
only the lower-order wavelets will be needed to capture the broad, slow undulations 
of the kernel. In regions where the kernel is discontinuous, or has appreciable high- 
frequency information, or is otherwise singular (a term discussed in Section 16.10), 
the higher-order wavelet coefficients will become significant. But this is true in those 
regions only . This is the big advantage of the wavelet transform over the Fourier 
transform. With Fourier, we could dispose of the high-frequency coefficients, but 
then we would lose all of the high-frequency information everywhere in the matrix. 
In general, disposing of this high-frequency information will change every element in 
the matrix, not just those where there is significant local high-frequency information. 
With wavelets, we have high-frequency information only where we need it, and the 
corresponding coefficients are near zero where the matrix is smooth. 
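
To make the sparsity claim concrete, here is a small Python sketch that discretizes a smooth kernel, applies a 2D Haar transform, and counts the coefficients that fall below a threshold. The Haar basis, the kernel $1/(1 + (t-u)^2)$, the $32 \times 32$ resolution, and the threshold of 0.1 are all assumptions chosen for illustration; the methods cited in the text use more elaborate wavelets.

```python
import math


def haar_1d(values):
    """Full orthonormal Haar decomposition of a list whose length is a power of two."""
    v = list(values)
    n = len(v)
    while n > 1:
        half = n // 2
        averages = [(v[2 * j] + v[2 * j + 1]) / math.sqrt(2.0) for j in range(half)]
        details = [(v[2 * j] - v[2 * j + 1]) / math.sqrt(2.0) for j in range(half)]
        v[:n] = averages + details  # coarse part first, detail part after it
        n = half
    return v


def haar_2d_standard(matrix):
    """Standard 2D decomposition: transform every row, then every column."""
    rows = [haar_1d(row) for row in matrix]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def kernel(t, u):
    return 1.0 / (1.0 + (t - u) ** 2)  # hypothetical smooth kernel


SIZE = 32
samples = [(i + 0.5) / SIZE for i in range(SIZE)]
k_matrix = [[kernel(t, u) for u in samples] for t in samples]
coefficients = haar_2d_standard(k_matrix)

THRESHOLD = 0.1  # assumed cutoff for "near zero"
small = sum(1 for row in coefficients for c in row if abs(c) < THRESHOLD)
fraction_small = small / (SIZE * SIZE)
```

For this smooth kernel at least three quarters of the coefficients fall below the threshold, because every coefficient that involves a finest-level wavelet in either direction is bounded by the small difference between adjacent samples. A kernel with a discontinuity would keep large fine-scale coefficients, but only near the discontinuity.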

For a signal of $2^n$ entries, we will write $\mathcal{P}_i$ to represent projection onto a
discretization of size $2^{i-n}$ [7]. So $\mathcal{P}_0$ is the finest-level resolution that matches the
original matrix. Recall that as with any projection method, we’re now working in a
transformed space, solving not $x = g + \lambda\mathcal{K}x$, but the related problem

$$\mathcal{P}_i x = \mathcal{P}_i g + \lambda \mathcal{P}_i \mathcal{K} \mathcal{P}_i x \tag{16.125}$$

where we have written $\mathcal{P}_i x$ for $x_i$. Let’s look more closely at the composite operator
on $x$ at level $n$, which we write as $\mathcal{K}_n = \mathcal{P}_n \mathcal{K} \mathcal{P}_n$.

We start by finding what K o looks like in the rectangular (or standard) 2D basis. 
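Defining the resolution-changing operator $Q_i$ as

$$Q_i = \mathcal{P}_{i+1} - \mathcal{P}_i \tag{16.126}$$

we can then write

$$\mathcal{P}_n = \mathcal{P}_0 + \sum_{i=0}^{n-1} Q_i \tag{16.127}$$

In the standard basis, we find

$$\begin{aligned}
\mathcal{K}_n &= \mathcal{P}_n \mathcal{K} \mathcal{P}_n \\
&= \Big( \mathcal{P}_0 + \sum_{i=0}^{n-1} Q_i \Big) \mathcal{K} \Big( \mathcal{P}_0 + \sum_{i=0}^{n-1} Q_i \Big) \\
&= \mathcal{P}_0 \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} \mathcal{P}_0 \mathcal{K} Q_i + \sum_{i=0}^{n-1} Q_i \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} \sum_{k=0}^{n-1} Q_i \mathcal{K} Q_k
\end{aligned} \tag{16.128}$$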


Schröder et al. have pointed out [384] that this form of representation can lead
to inefficient projection of the signal, meaning that we’ll have more significantly
nonzero coefficients than absolutely required. They note that this is due to the fact
that different levels of projection are involved in these terms (e.g., $\mathcal{P}_0$ and $Q_i$), so
that compression at one level does not always completely exploit compression at all
others.

These problems are addressed by writing the composite operator in the square 
(or nonstandard) basis, in a so-called telescoping sequence : 


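$$\begin{aligned}
\mathcal{K}_n &= \mathcal{P}_n \mathcal{K} \mathcal{P}_n \\
&= \mathcal{P}_0 \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} \left( \mathcal{P}_{i+1} \mathcal{K} \mathcal{P}_{i+1} - \mathcal{P}_i \mathcal{K} \mathcal{P}_i \right) \\
&= \mathcal{P}_0 \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} \left( (\mathcal{P}_i + Q_i) \mathcal{K} (\mathcal{P}_i + Q_i) - \mathcal{P}_i \mathcal{K} \mathcal{P}_i \right) \\
&= \mathcal{P}_0 \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} \left( \mathcal{P}_i \mathcal{K} \mathcal{P}_i + \mathcal{P}_i \mathcal{K} Q_i + Q_i \mathcal{K} \mathcal{P}_i + Q_i \mathcal{K} Q_i - \mathcal{P}_i \mathcal{K} \mathcal{P}_i \right) \\
&= \mathcal{P}_0 \mathcal{K} \mathcal{P}_0 + \sum_{i=0}^{n-1} Q_i \mathcal{K} \mathcal{P}_i + \sum_{i=0}^{n-1} \mathcal{P}_i \mathcal{K} Q_i + \sum_{i=0}^{n-1} Q_i \mathcal{K} Q_i
\end{aligned} \tag{16.129}$$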

Now we have an equivalent representation, based only on operators at the same 
level i working together, leading to more efficient compression. 
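
The telescoping identity is purely algebraic, so it can be checked numerically for any family of matrices related by $Q_i = \mathcal{P}_{i+1} - \mathcal{P}_i$. The Python sketch below does this for an 8-sample signal. The block-averaging projections (with $\mathcal{P}_n$ taken to be the identity), the kernel matrix, and all helper names are assumptions made for the illustration and are not the construction used in [7] or [384].

```python
def matmul(a, b):
    """Product of two dense matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def projection(level, n):
    """Assumed P_level: average over blocks of 2**(n - level) samples."""
    size = 2 ** n
    block = 2 ** (n - level)
    return [[1.0 / block if i // block == j // block else 0.0 for j in range(size)]
            for i in range(size)]


N_LEVELS = 3
SIZE = 2 ** N_LEVELS
# Hypothetical discretized kernel.
K = [[1.0 / (1.0 + abs(i - j)) for j in range(SIZE)] for i in range(SIZE)]
P = [projection(i, N_LEVELS) for i in range(N_LEVELS + 1)]
Q = [sub(P[i + 1], P[i]) for i in range(N_LEVELS)]

lhs = matmul(matmul(P[N_LEVELS], K), P[N_LEVELS])  # P_n K P_n
rhs = matmul(matmul(P[0], K), P[0])                # P_0 K P_0
for i in range(N_LEVELS):
    # Add the three same-level terms of the nonstandard form.
    for left, right in ((Q[i], P[i]), (P[i], Q[i]), (Q[i], Q[i])):
        rhs = add(rhs, matmul(matmul(left, K), right))

max_difference = max(abs(x - y) for rl, rr in zip(lhs, rhs) for x, y in zip(rl, rr))
```

The two sides agree to rounding error. Note that every term on the right couples operators at a single level $i$, which is the property the nonstandard form is meant to provide.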

More general wavelet methods applied to the Galerkin technique are reported by
Xu and Shann in [493], and efficient quadrature methods are described in Sweldens
and Piessens [429].

Method           Matrix element i, k
---------------  ------------------------------
Nystrom          $w_k\,k(t_i, t_k)$
Collocation      $p_i(t_k)$
Least-squares    $\langle p_k \mid p_i \rangle$
Galerkin         $\langle p_k \mid h_i \rangle$

TABLE 16.2

Matrix elements for Nystrom, collocation, least-squares, and Galerkin methods.


16.8.8 Discussion 

A review of the integral equation literature quickly reveals that the polynomial 
collocation, least-squares, and Galerkin methods are overwhelmingly the three most 
popular and influential algorithms for solving integral equations in general use today. 
The Nystrom method is an important addition to this list because it provides a 
simple and flexible alternative. The wavelet basis is a recent arrival and is still being 
developed, primarily to accelerate the performance of these algorithms. 

Each of these four methods may be characterized by the matrix it forms to solve
a set of linear equations. Table 16.2 summarizes the matrix element $m_{ik}$ for each of
these techniques.

Other algorithms do exist; we mention two here as representative examples. 
Kantorovich's method [343] is a projection method. It produces a sequence of 
integral equations which, when solved by Galerkin methods, will converge to an 
answer for x more quickly than Galerkin applied to the original equation. 

Another technique is in fact a general strategy for improving quadrature-based
solution techniques, and is called the method of iterated deferred correction [26]. The
method begins by creating two estimates for $x$ from the original integral equation:
a crude estimate $x_0$ and a slightly better (and more expensive) estimate $x'_0$. If both
functions are evaluated at $n$ points, then we can construct an $n$-element error vector
$e_0 = x'_0 - x_0$. It turns out that $e_0$ satisfies a Fredholm equation of the second kind
using a kernel derived from the original kernel and the quadrature rule, so we can
use any of the methods in this chapter to solve for $e_0$. We now compute $x_1 = x_0 + e_0$.
If we run $x_1$ through the quadrature rule, it won’t in general be exactly equivalent to
$x_0 + e_0$, so we can write $e_1 = x_1 - (x_0 + e_0)$. This gives us a new error function
$e_1$, which we can again solve for and add into $x_1$ to create an $x_2$, and so on. We
repeat this process until the sequence appears to have converged, the accuracy is good
enough, or patience is exhausted.

16.9 Monte Carlo Estimation

We mentioned earlier that we can use Monte Carlo to evaluate the 1D integrals
required by most projection methods. In this section we investigate using Monte
Carlo to estimate a solution to the full integral equation of Equation 16.1:

$$x(t) = g(t) + \lambda \int k(t, u)\,x(u)\,du \tag{16.130}$$


We know from Chapter 7 that in principle we can form a Monte Carlo estimation 
of the integral of any function that we can evaluate at specific points, but that this 
estimate is likely to have high variance and converge slowly (the error shrinks only in
proportion to $1/\sqrt{n}$ for $n$ samples). When the problem is complicated and the kernel difficult to evaluate
(as in our applications), this sort of naive sampling will prove so time consuming as 
to be useless in practice. To make Monte Carlo practical for this problem, we need 
some way to improve the efficiency of our estimates. 

Perhaps the most general variance-reduction technique is importance sampling, presented
in Section 7.5.2. For convenience, we will review the idea very quickly here before
moving on.

Recall that to estimate an integral

$$M = \int m(t)\,dt \tag{16.131}$$

we can rewrite the function $m$ as the product of two functions $m(t) = g(t)f(t)$,
giving us the equivalent problem

$$M = \int g(t)\,f(t)\,dt \tag{16.132}$$


This can always be done; trivially, we can set $g(t) = m(t)$ and $f(t) = 1$. Often
by construction $f(t)$ is normalized, that is, $\int f(t)\,dt = 1$. Then we can treat $f(t)$
as a probability distribution function, and draw samples $t_i$ one at a time from that
pdf. We evaluate $g$ at each of these sample points, and then average together these
samples of $g(t_i)$ to form an estimate of $M$.

The efficiency of this process (that is, the speed with which our estimates converge
to the true value of $M$) can be improved dramatically by judicious choice of $f(t)$.
The basic idea is that the final integral is simply a scalar, made up of a sum of
many smaller scalars (the $g(t_i)$). The scalars that make the most contribution to the
final figure are those that are largest in size. So if we skew our sampling in such a
way that most of the samples are drawn from where the function $g(t)$ has a large
magnitude, then we will sample the most important parts of the integral most of the
time, speeding us to a reliable answer. We can achieve this by choosing a function
$f(t)$ that is large where the original function $m(t)$ is large; because $f$ tracks the
amplitude (or importance) of $m$, we call such an $f$ an importance function. So by
sampling $\int m(t)f(t)\,dt$, we will tend to preferentially draw samples from $m$ where its
magnitude is large. This is called importance sampling. There are two big problems
with this scheme: the first is that we’ve now biased our estimate so that the integral
is no longer equal to $M$, because in general $f(t) \neq 1$. The other problem is that in
order to design the right $f$, we need to already know $m$.

The first problem can be solved by scaling $m$ by $1/f$:

$$M = \int \frac{m(t)}{f(t)}\,f(t)\,dt \tag{16.133}$$

Now we draw samples $t_i$ as directed by $f$, evaluate $m_i = m(t_i)/f(t_i)$, and our
resulting integral is unchanged. The second problem is much harder; if we knew $m$
well enough to design an ideal $f$, we wouldn’t need to go through this whole process
at all. In general, we guess at an $f$ based on whatever information we can gather
about the function $m$, and we use that guess as the importance function. We can
update our guess over time as we learn more about $m$ to improve our efficiency. If
$f$ is close to the ideal importance function, our efficiency will be greatly improved.
This does not mean that any function $f$ is useful; if $f$ is sufficiently far away from
the ideal, our efficiency can be reduced far below that of naive Monte Carlo.
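
A minimal Python sketch of Equation 16.133 follows. The integrand $m(t) = 3t^2$ on $[0,1]$ (so that $M = 1$), the importance function $f(t) = 2t$, the sample counts, and the function names are all assumptions chosen so that the answer is known in advance.

```python
import math
import random


def m(t):
    return 3.0 * t * t  # hypothetical integrand; its integral over [0, 1] is 1


def f(t):
    return 2.0 * t  # assumed importance pdf, large where m is large


def sample_f(rng):
    """Draw t from f by inverting its CDF F(t) = t**2."""
    # 1 - random() lies in (0, 1], which keeps t away from 0 where f vanishes.
    return math.sqrt(1.0 - rng.random())


def estimate_importance(n, rng):
    """Average of m(t_i) / f(t_i) over samples t_i drawn from f."""
    total = 0.0
    for _ in range(n):
        t = sample_f(rng)
        total += m(t) / f(t)
    return total / n


def estimate_uniform(n, rng):
    """Naive estimate: average of m at uniformly distributed samples."""
    return sum(m(rng.random()) for _ in range(n)) / n


rng = random.Random(12345)
m_importance = estimate_importance(20000, rng)
m_uniform = estimate_uniform(20000, rng)
```

Both estimates converge to $M = 1$, but for this choice the per-sample variance of $m/f = \tfrac{3}{2}t$ under $f$ is $0.125$, compared with $0.8$ for uniform sampling of $m$, so the importance-sampled estimate settles down with far fewer samples.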

The beauty of importance sampling is that the importance function $f$ can be
almost anything (assuming it satisfies the conditions of a pdf in Section 7.5.2). We
can create $f$ based on the function $m$, or by combining simple analytic functions, or
by plotting the daily rainfall in some city; $f$ need not be tied to any other information
in the problem. We therefore have tremendous freedom in specifying what is
“important” in any given situation.

We will see that importance sampling can be of great abstract and practical use 
in solving complicated integral equations for rendering. This is because our integral 
equations describe the distribution of light in an environment, and we usually don’t 
care to compute equally accurate estimates everywhere in the scene: in particular, 
we don’t really care about getting precise results on surfaces we can’t see, as long 
as the results on the surfaces we can see are accurate. The result is that we can use 
importance sampling to direct our attention to find accurate estimates of light energy 
on the parts of the scene that matter to us for some purpose, and save the work in 
those regions of the scene that don’t matter. Our discussion in this section will be 
mostly based on Coveyou [105], Hammersley and Handscomb [183], and Kalos and 
Whitlock [239]. 

The material in this section has a very natural and compelling geometrical interpretation.
Almost all of the equations can be understood in terms of particles moving
through an environment, starting at some source and then bouncing off of surfaces
until they are finally absorbed (or escape the system), distributing information or
energy as they travel. This is, however, only one interpretation of the mathematics,
which are quite general and open to other interpretations. The appeal of the geometric
interpretation is that it supports our visual imagination, develops intuition,
and anticipates our ultimate use of this material. The disadvantage of stressing this 
one interpretation too hard is that it may suggest that it is the only interpretation; 
that may make it harder to recognize its applicability in other situations where it 
is appropriate. Cognizant of this risk, I feel it would be a shame to suppress the 
evocative geometric interpretation in favor of a more abstract and remote approach. 
So although the big picture in this section will generally be in abstract terms, the 
interpretations of the results and steps along the way will be strongly geometric. 


16.9.1 Random Walks

Our goal is to find values for the unknown function x as specified by an integral 
equation such as Equation 16.1: 


$$x(t) = g(t) + \lambda \int k(t, u)\,x(u)\,du \tag{16.134}$$


We will find it useful to recall Chapter 12 and anticipate Chapter 17 and think 
of Equation 16.134 as a transport equation . That is, it describes the propagation of 
some unspecified “stuff” throughout an environment. In this section we will assume 
for the sake of discussion that this stuff is visible light energy. The light is generated
by the driving function $g$, and the kernel $k$ describes how the light is transported by
discrete particles from one place in the environment to another.

The most convenient terminology for the following discussion is based on the 
idea of particle state first seen in Chapter 12. The idea is that each particle may be 
characterized by some vector of attributes; if there are n scalar elements in the vector, 
then we can think of the particle as a point in an n-dimensional state space . We 
will find it convenient to think of the domain over which we evaluate our integral
equation as a bounded interval $T = [a, b]$, which is tiled into a set of $p$ intervals $I_i$,
which together cover the domain. Since these intervals tile the domain, the $I_i$ are
disjoint (they do not overlap) and complete (together, they cover the domain $T$). In
a purely 2D world, such as that in Figure 16.15, each interval may be thought of as
a piece of surface [202].

A complete state description of a particle can conceptually contain almost any
information; a reasonable starting point might be position, direction of travel, energy,
and time. To make the discussion below simple, we will eliminate all of these but
position. When we say that a particle is “in” some state $S_v$, we mean that it is
just arriving at the surface associated with interval $I_v$. This arrival can be due to
the spontaneous generation of the particle at the surface if it is a source, or as a
result of flight from some other surface $I_u$ (or its associated state $S_u$). The kernel
$k$ of the integral equation can then be interpreted as the probability that a particle
will travel from state $S_u$ (a point $u$ in interval $I_u$) to state $S_v$ (a point $v$ in interval
$I_v$). We can write this more explicitly as $k(S_u \to S_v)$. Note that we have placed an
explicit arrow between the arguments to indicate the flow of particles. If we only use
left-to-right arrows, then the kernel term which has so far appeared as $k(t, u)$ will
appear as $k(S_u \to S_t)$; notice the switch in the order in which the states are listed.
In physical terms, the kernel represents the source distribution or reflectivity of the
surface, giving the likelihood that a particle at one surface will travel toward any
other. The value of our unknown $x(t)$ is then the number of particles to be found in
state $S_t$; that is, on the surface $p$ for which $t \in I_p$.

FIGURE 16.15

The domain [0,1] broken up into five intervals $I_0$ to $I_4$ that are complete and disjoint over the
domain.

We will find it convenient throughout most of this chapter to refer to states simply
as “state $t$,” or even simply $t$. We will maintain the arrow notation in the kernel to
emphasize the direction of transfer (e.g., $k(u \to v)$), but we will often speak of states
$t$ and $u$ rather than $S_t$ and $S_u$. This convention allows us to discuss the ideas and
the math in terms of states, but still use similar notation as in the rest of the book.

This simple definition of the state means that the kernel doesn’t have access to the 
direction from which the incident particle is arriving. The emission (or reflection) 
distribution may depend on many factors, but the incident angle is not one of them. 
We make this assumption only to simplify the discussion. The state description could 
easily include the incident direction, time, and energy of the particle, as well as other 
information such as all past states. The kernel k could then theoretically make use of 
all this information. So it is simply for convenience that we will consider the kernel
that gives the probability $k(u \to v)$ of transport from state $u$ to state $v$ to be identical
for all particles in state $u$.

We note that the discretization of the domain into intervals is also simply
for convenience. Ultimately, every state $S_t$ can be considered to represent only a
single particle vector t rather than a whole range of such vectors; then all of the 
involved functions will be continuous. We will maintain this idea of continuity in 
the math because it is conceptually simpler, though in a computer implementation, 
discretization is an unavoidable fact of life. 

By thinking of particles as independent objects that assume a variety of states, we
are led to an idea of particle history. We say that a particle is described by a series
of states, $\{S_0, S_1, \ldots, S_n\}$. The state $S_0$ is variously described by the adjectives first,
initial, birth, creation, and source. The state $S_n$ is variously described as last, final,
death, termination, and absorption. The complete set of states is called the path
history (or just path or just history) of the particle.


16.9.2 Path Tracing 

It will be very useful for us to find a way to describe the states through which a 
particle passes during its history. In general, we can start following states at any 
point from creation to absorption, and can follow the history either backward or 
forward. We begin by following the history backward from some state t , which may 
or may not be the final state. 

We start with the integral equation in state transition form: 

$$x(t) = g(t) + \int k(s \to t)\,x(s)\,ds \tag{16.135}$$

We now start following a particle backward from a state $t$; this will give us a single
estimate for $x(t)$. We start by assuming that $g(t)$ is a pdf in the variable $t$, and that
$k(s \to t)$ is a pdf in $s$ for some given value of $t$. Note that the kernel is being used to
compute the transfer from state $S_s$ to state $S_t$.

We begin by picking a starting state for the particle: that is, we pick a $t$ from the
pdf $g(t)$ to get state $S_t$. This one data point gives us the estimate

$$x(t) \approx g(t) + \int k(s \to t)\,x(s)\,ds \tag{16.136}$$

where we have a value for $g(t)$, but the integral term is still just a formal expression
for which we have no numerical value.

Now we want to estimate the value of the integral using Monte Carlo: 

$$\int k_t(s)\,x(s)\,ds \tag{16.137}$$

where $k_t(s)$ is the one-parameter function of $k(s \to t)$ for a given $t$; that is, the
probability of a particle successfully arriving at state $t$ from state $s$. We know how
to estimate this type of integral: think of $k_t(s)$ as a pdf, draw a sample value $s_s$
from $k_t(s)$, and estimate $x(s_s)$. Since we have a value of $t$ from the first state $S_t$, we
have $k_t(s)$, and we sample it to find the random variable $s_s$. We can add this to our
estimate from before to get

$$\begin{aligned}
x(t) &\approx g(t) + \int k_t(s)\,x(s)\,ds \\
&= g(t) + x(s_s) \\
&= g(t) + g(s_s) + \int k(r \to s_s)\,x(r)\,dr
\end{aligned} \tag{16.138}$$

where the integral on the right comes from reapplying the definition of $x$. We are
now in state $S_{k-1}$. Geometrically, we have a particle that has started at $t$, and has
now “bounced” once off of the bit of surface containing $s_s$, and is now directed
toward the previous surface and state $S_{k-2}$, as in Figure 16.16.

To estimate the integral on the last line of Equation 16.138, we again apply Monte
Carlo, now selecting an $r_s$ from the pdf $k_s(r)$ (since we now have a value of $s$), and
evaluating $x$ at that $r_s$, giving us

$$x(t) \approx g(t) + g(s_s) + g(r_s) + \int k(q \to r_s)\,x(q)\,dq \tag{16.139}$$

and on it goes. 

If the kernel is not normalized, that is,

$$\sigma(t) = \int k(t \to u)\,du < 1 \tag{16.140}$$

then the particle has a probability $\alpha(t) = 1 - \sigma(t)$ of being absorbed at any given
evaluation. At that point the recursion stops and we have our estimate for $x$. In
effect we are following the particle as it bounces through the environment, where
each bounce is found by sampling the kernel, using the current state of the particle
as the kernel’s first argument. This technique is called path tracing. Generating
the series of states assumed by the particle is called creating a random walk for the
particle, since it appears to unpredictably “walk” from one state to another. Here
we have walked backward; we can in principle walk in either direction.

FIGURE 16.16

One bounce off of state $S_1$.

As mentioned earlier, the kernel has a natural geometric interpretation as the
reflectivity of a surface. The value $k(t \to u)$ is the likelihood that a particle currently
at the surface including $t$ will be propagated successfully to a surface containing $u$.

The general idea behind using this approach in practice comes from realizing that 
one random walk describes the sequence of states taken on by only one particle. If 
we follow enough walks, then the number of particles that visit each state will begin 
to approximate the distribution that would be generated by a real set of particles 
controlled by the probabilities given by the kernel. 
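
The backward walk can be sketched in a few lines of Python for a discretized state space. Everything concrete below is an assumption for illustration: two states, a table `K[t][s]` holding the probability $k(s \to t)$ of having arrived at $t$ from $s$ (each row sums to $0.5$, so the absorption probability is $0.5$), the driving values `g`, and the function names. For these numbers the exact solution of $x = g + \mathcal{K}x$ is $x = (1.5,\ 2/3)$.

```python
import random

# K[t][s] = k(s -> t): probability that a particle now in state t came from s.
# Hypothetical values; each row sums to 0.5, leaving 0.5 for absorption.
K = [[0.2, 0.3],
     [0.4, 0.1]]
g = [1.0, 0.0]  # hypothetical driving function: only state 0 emits


def walk_estimate(t, rng):
    """Follow one particle backward from state t; return one sample of x(t)."""
    score = g[t]
    while True:
        u = rng.random()
        cumulative = 0.0
        previous = None
        for s, prob in enumerate(K[t]):
            cumulative += prob
            if u < cumulative:
                previous = s
                break
        if previous is None:
            # u fell in the absorption mass 1 - sigma(t): the walk ends here.
            return score
        t = previous
        score += g[t]  # accumulate g at each state visited, as in Equation 16.139


def estimate_x(t, walks, rng):
    """Average many independent walks started from state t."""
    return sum(walk_estimate(t, rng) for _ in range(walks)) / walks


rng = random.Random(2024)
x0 = estimate_x(0, 40000, rng)
x1 = estimate_x(1, 40000, rng)
```

Each walk returns $g(t) + g(s_s) + g(r_s) + \cdots$ for the states it happens to visit, and the average over many walks approaches the solution of the linear system, here close to $1.5$ and $0.667$.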

This procedure has the advantage of simplicity, but if the absorption probability 
is small, then it might take a huge number of bounces for each particle to terminate. 
In rendering, the operations associated with each bounce can be very expensive to 
compute. Recall that each time a particle enters a state $S_i$, there is a probability $\alpha_i$
that it will be absorbed. Therefore, out of every $N$ particles entering that state, $\alpha_i N$
will terminate and $(1 - \alpha_i)N$ will continue on to the next state. So the likelihood of
a particle making it to state $S_n$ is the combined probability of propagation at every
state leading there, or $(1 - \alpha_1)(1 - \alpha_2) \cdots (1 - \alpha_{n-1})$. This can be a small number
indeed, but we still pay the same high cost to follow this particle as for some other
particle path that might be more likely and more influential on our result.

We can make things more efficient by replacing this swarm of identical particles
with weighted particles. When a weighted particle moves from one state to another
it carries with it a weight that describes the likelihood that this particle will occur
on this path. In other words, suppose that 100 particles enter some state $S_i$ that
has an absorption probability of $\alpha_i$. Then we can absorb $100\alpha_i$ of them and follow
the $100(1 - \alpha_i)$ others, or simply follow one particle with new weight $1 - \alpha_i$.

We may be tempted to make this process even more efficient by terminating a 
particle when its weight (that is, its probability of continuing on to another state) falls 
below some threshold $\tau$. This would certainly save us work, but it would introduce
a systematic bias in the answer, since all of those low-probability contributions can 
sum together into a meaningful contribution. 

Instead of just stopping our particle history when the probability falls below $\tau$, we
stop probabilistically, and use a weighting scheme similar to importance sampling.
First, we select a number $N$ before starting the simulation: Coveyou et al. suggest
$2 < N < 10$ [105]. To start a particle history, we create the particle with a weight
$w_0 = 1$. When this weight falls below $\tau$, we generate a uniformly distributed random
number $\xi \in [0,1]$. If $\xi > 1/N$, then we say the particle is absorbed. Otherwise, we
scale the weight of the particle by the factor $N$, and generate a next state by sampling
the kernel. This procedure is called Russian roulette, named after the lethal game.
The idea is to cut down on the number of below-threshold particles that we follow
by a factor of $N$ at each below-threshold bounce; this prunes our particle count
quickly. To compensate, we increase the weight of each surviving particle. When
we compute $x(t)$, we don’t simply count the number of particles in each state $t$, but
instead sum together their weights.
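
The weighted walk with Russian roulette can be sketched as follows. This is an illustration under assumed data, not the procedure of [105] verbatim: the two-state table `K[t][s]` and driving values `g` are hypothetical, the threshold $\tau = 0.1$ and $N = 4$ are arbitrary choices, and the exact answer for the chosen numbers is $x(0) = 1.5$.

```python
import random

# Hypothetical two-state problem: K[t][s] = k(s -> t), g = driving function.
K = [[0.2, 0.3],
     [0.4, 0.1]]
g = [1.0, 0.0]
sigma = [sum(row) for row in K]  # survival probability in each state


def weighted_walk(t, rng, tau=0.1, n_roulette=4):
    """One weighted backward walk from state t, terminated by Russian roulette."""
    weight = 1.0
    score = 0.0
    while True:
        score += weight * g[t]
        # Fold the survival probability into the weight instead of absorbing.
        weight *= sigma[t]
        if weight < tau:
            if rng.random() > 1.0 / n_roulette:
                return score       # the particle loses the roulette round
            weight *= n_roulette   # the survivor stands in for N particles
        # Choose the previous state s with probability K[t][s] / sigma[t].
        u = rng.random() * sigma[t]
        cumulative = 0.0
        previous = len(K[t]) - 1   # fallback guards against rounding at the edge
        for s, prob in enumerate(K[t]):
            cumulative += prob
            if u < cumulative:
                previous = s
                break
        t = previous


rng = random.Random(7)
walks = 40000
x0_roulette = sum(weighted_walk(0, rng) for _ in range(walks)) / walks
```

A survivor is kept with probability $1/N$ and its weight is multiplied by $N$, so the expected weight carried forward is unchanged and the estimate remains unbiased, while low-weight walks are cut short instead of being followed indefinitely.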

Figure 16.17 shows an example of this procedure. We assume that each surface
has a constant reflectivity of $\rho < 1$ in all directions, and that the threshold kicks in
just after the third bounce; to get this behavior, we set $\tau = \rho^2 + \rho/2$. The particle
starts in $S_0$ at the source, with weight $w_0 = 1$, since it is certainly emitted (a different
normalization would start the particle with its probability of creation $g(t)$). The
particle enters state $S_1$ with certainty, so it contributes a weight of $w_1 = 1$. Its
probability of continuing is given by $\rho$, so it assumes that weight, $w_2 = \rho$, as it moves
out of $S_1$ and contributes it to $S_2$. The probability that the particle survives the
second bounce is $\rho^2$, so entering state $S_3$ we find $w = \rho^2 < \tau$; this triggers a round of
Russian roulette in state $S_3$. Supposing that we have survival, the particle is passed
on with weight modulated by $\rho$ but amplified by $N$; that is, $w_3 = (N\rho)(\rho^2) = N\rho^3$.
We suppose the same thing happens at state $S_4$, so the continuing particle now has
weight $w_4 = N^2\rho^4$. Finally, at state $S_5$ the random number $\xi$ is above $1/N$, so the
particle is terminated at that state with the weight it had upon arrival: $w_5 = w_4$. So
this particle contributes the weights $\{1, 1, \rho, \rho^2, N\rho^3, N^2\rho^4, N^2\rho^4\}$ to the six states
$S_0$ through $S_5$, representing the expected proportions of particles that would occupy
those states if we simulated many more particles.

FIGURE 16.17

A five-step random walk.


16.9.3 The Importance Function

In some applications, we don’t necessarily need the solution $x(t)$ at all values of $t$,
but only at some subset. This is often the case in computer graphics. The most 
obvious example in rendering is that we only need accurate solutions of the radiance 
equation on those surfaces that are visible; as long as those surfaces are correctly 
represented, we don’t really care what the accuracy is elsewhere. At other times we 
need to pay close attention to accuracy in shadows, or at shadow boundaries. For 
example, when simulating the light falling on a garden, we may not care about the 
illumination within large, well-lit plots, but we might care a lot about how much 
light falls over the course of a season on the boundaries between crops of different 
types. In a simulation for an art gallery, we may not care about the light falling 
anywhere in the scene except on the surfaces of valuable paintings and sculptures. 
We can use the ideas of importance sampling to direct our solution technique to 
those locations in the environment where we need accuracy for any reason. 

The tricky part of this type of importance sampling in rendering situations is that 
we typically want to specify immediate importance. That is, we want to indicate 
explicitly only those surfaces where we need accuracy. But since those surfaces 
receive light from potentially everywhere in the environment, we cannot completely 
ignore the surfaces that we don’t explicitly indicate to be important. To reconcile this 
problem, we propagate importance through the environment, just as light is created 
from light sources. If a surface is important, then any surface that can contribute a 
significant amount of illumination to that surface is also important (though perhaps 
less so), and the surfaces that contribute to that contributor are also (though even 
less) important, and so on. Finding this potential importance everywhere in the 
domain will occupy our attention here. 

To see how to apply these ideas, we begin by considering how the driving function 
g(t) distributes its effect into the environment via the kernel $\mathcal{K}$ [105]. In the following 
discussion, we will think of all our functions as probabilistic descriptions of the ideal 
functions. That is, x(t) will represent not the exact x(t), but an estimate that we 
compute with increasing accuracy. Formally, we might write E(x(t)) and E(g(t)), 
but we would soon have more E’s than anything else in our formulas. So keep in 
mind that because we’re dealing with random sampling, our solution functions are 
not exact functions but just probable approximations of varying accuracy. 

We therefore solve 

$$ x(t) = g(t) + \int x(s)\,k(s \to t)\,ds \tag{16.141} $$

where x(t) is to be interpreted as the expected value of x, that is, the probability of 
finding a particle in state t; x(t) is the probabilistic form of the particle density. The 
probable number of particles in a volume dt around t is then x(t) dt. We interpret 
g(t) similarly as the expected density of creation of particles at t. Note again that 
we are writing $k(s \to t)$ so the flow of particles is left to right, like the reading of 
the equation; that is, $x(s)\,k(s \to t)$ describes particles leaving state s and traveling 
through $k(s \to t)$ into state t. We’re looking backward here; to find the value of x(t), 
we look around at all the surfaces from which particles may have come to contribute 
to the density at t. 

We define the value $x_p(t)$ as the probability that the pth state of some particle 
occurs at state $S_t$. The value $x_0(t)$ is just the probability of creation of a particle at t: 

$$ x_0(t) = g(t) \tag{16.142} $$

FIGURE 16.18 
A particle moving from state u to land at state w after p steps. 

After one bounce, the particle moves from creation state $S_s$ to termination state $S_t$; 
the probability $x_1(t)$ of finding a particle at state t after one bounce from any creation 
state $S_s$ is then the probability of a particle being at state $S_s$, times its probability of 
reaching $S_t$, summed over all such states $S_s$: 

$$ x_1(t) = \int x_0(s)\,k(s \to t)\,ds \tag{16.143} $$

The probability of finding a particle at state t at bounce p + 1 is then 

$$ x_{p+1}(t) = \int x_p(s)\,k(s \to t)\,ds \tag{16.144} $$

So the probability x(t) of finding a particle at t on any bounce is then simply the sum 
of the probabilities for each bounce: 

$$ x(t) = \sum_{p=0}^{\infty} x_p(t) \tag{16.145} $$
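
For a finite set of states, the bounce-by-bounce sum of Equations 16.144 and 16.145 can be tried directly. The following is a sketch under assumptions of our own: the kernel is stored as a hypothetical matrix with `K[s][t]` standing in for k(s → t), the three-state values are invented for illustration, and each row sums to less than one so that some probability is left for absorption.

```python
def step(x_p, K):
    """One bounce: x_next[t] = sum over s of x_p[s] * K[s][t]."""
    n = len(x_p)
    return [sum(x_p[s] * K[s][t] for s in range(n)) for t in range(n)]

def density_by_bounces(g, K, bounces):
    """Accumulate x = x_0 + x_1 + ... + x_bounces, starting from x_0 = g."""
    x_p = list(g)
    x = list(g)
    for _ in range(bounces):
        x_p = step(x_p, K)
        x = [a + b for a, b in zip(x, x_p)]
    return x

# Hypothetical three-state kernel and a source located in state 0.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
g = [1.0, 0.0, 0.0]
x = density_by_bounces(g, K, 200)
```

After enough bounces the accumulated `x` should satisfy the discrete form of Equation 16.141, x = g + (one more bounce applied to x), to within rounding.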

Now let $k_p(u \to w)$ be the probability that when a particle is in a state u, there 
will be at least p more steps, and that the pth step will be in state w; that is, this tells 
us how likely it is that if we start in state u and take p steps, we will end up in state 
w. We symbolize this in Figure 16.18, which diagrams the path of a particle from 
state u to state w, where each change in the direction of the path indicates a change 
of state, weighted by the kernel at that state used as a pdf. 
We can write this symbolically as 

$$ k_p(u \to w) = \int \cdots \int k(u \to v_1)\,k(v_1 \to v_2) \cdots k(v_{p-2} \to v_{p-1})\,k(v_{p-1} \to w)\,dv_1 \cdots dv_{p-1} \tag{16.146} $$

Note that this is just an iterated kernel, as defined in Equation 16.44. 

As before, we can write a few special cases to get a feeling for this probability. 
We begin with $k_0(u \to w)$, which is simply the probability that state u is the same as 
state w; this is just the delta function for $S_u = S_w$: 

$$ k_0(u \to w) = \delta(u \to w) \tag{16.147} $$

where 

$$ \delta(u \to v) = \begin{cases} 1 & \text{if } S_u = S_v \\ 0 & \text{otherwise} \end{cases} \tag{16.148} $$

The probability for one bounce $k_1$ is simply the kernel k itself: 

$$ k_1(u \to w) = k(u \to w) \tag{16.149} $$


The probability of getting from state u to state w in p + 1 bounces is just the probability 
of getting from state u to any state v in p bounces times the probability of moving 
from that state v to state w. In symbols, 

$$ k_{p+1}(u \to w) = \int k_p(u \to v)\,k(v \to w)\,dv \tag{16.150} $$


We integrate because any state v will do; this is illustrated in Figure 16.19 for two 
different states v . 

Finally, we can find the probability of reaching w from u in p + q bounces as 
the product of reaching some state v from u in p bounces times the probability of 
reaching w from v in q bounces, as shown in Figure 16.20. Symbolically, 

$$ k_{p+q}(u \to w) = \int k_p(u \to v)\,k_q(v \to w)\,dv \tag{16.151} $$
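
With finitely many states the iterated kernels become matrix powers, and the composition rule of Equation 16.151 can be checked numerically. The sketch below assumes a hypothetical three-state matrix `K` whose entry `K[u][w]` stands in for k(u → w); the helper names are ours, not the text’s.

```python
def identity(n):
    """Discrete delta function: k_0 as a matrix."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    """Sum over the intermediate state v, as in Equation 16.150."""
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][v] * B[v][j] for v in range(m)) for j in range(r)]
            for i in range(n)]

def iterated_kernel(K, p):
    """k_p built by the recurrence k_0 = delta, k_{p+1} = k_p composed with k."""
    result = identity(len(K))
    for _ in range(p):
        result = matmul(result, K)
    return result

# Hypothetical three-state kernel.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
k5_direct = iterated_kernel(K, 5)
k5_split = matmul(iterated_kernel(K, 2), iterated_kernel(K, 3))
```

Here `k5_direct` and `k5_split` should agree entry by entry, up to rounding, which is the discrete counterpart of splitting p + q bounces at an intermediate state v.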

We can combine these ideas of expected density and multibounce probability to 
give the following equalities: 


$$ \int x_{p-1}(s)\,k(s \to t)\,ds = x_p(t) = \int g(v)\,k_p(v \to t)\,dv \tag{16.152} $$

In words, the left side tells us that $x_p(t)$, the expected number of particles to be 
found in state t on bounce p, is given by the expected number in some other state 
s times the probability that they will get from s to t in one more bounce, summed 
over all states s. The right side says that the same quantity can be found by summing 
the contribution of every source at v, where we determine its contribution at t by 
finding the probability that a particle starting at v will end at state t after p bounces. 
This equivalence is diagrammed in Figure 16.21. 

FIGURE 16.19 
The transport from u to w via p − 1 bounces from u to any v, and then one bounce from v to w. 

It is possible that a particle starting in state u may visit state v many times during 
the course of its history before it is finally absorbed. The Green’s function $G(u \to v)$ 
expresses this probability [105]: 

$$ G(u \to v) = \sum_{p=0}^{\infty} k_p(u \to v) \tag{16.153} $$

That is, $G(u \to v)$ tells us how many times a particle in state u is likely to visit a state 
v before it is absorbed. From this, we can write 

$$ x(t) = \int g(u)\,G(u \to t)\,du \tag{16.154} $$

That is, the expected number of particles in state t is given by the number of particles 
leaving a source u times the number of times each particle is likely to visit t, 
summed over all sources u. We note that we can use the Green’s function to derive 
other transitions using the p-bounce kernels; for example, 

$$ G(u \to w) = \delta(u \to w) + \int G(u \to v)\,k(v \to w)\,dv \tag{16.155} $$

FIGURE 16.20 
Traveling from u to v in p − 1 bounces, and then from v to w in q − 1 more bounces. 


FIGURE 16.21 
Two ways of finding the contribution of v to t. 

We can multiply through by dw to get the number of particles in state w rather than 
just the density: 

$$ G(u \to w)\,dw = \delta(u \to w)\,dw + \int G(u \to v)\,k(v \to w)\,dv\,dw \tag{16.156} $$

In words, this says that the number of particles from state u that will move to a 
region dw around state w is given by the number of particles already in w (that is, 
we include the case u = w), plus the number of particles from u that will eventually 
make it to a state v times the probability that those particles will move from v to w 
in dw , summed over all such intermediate states v . 
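
In a finite state space the Green’s function is simply the sum of the matrix powers of the kernel, and the recurrence of Equation 16.155 becomes a matrix identity. The following sketch assumes a hypothetical three-state kernel and truncates the infinite sum after a fixed number of terms; both choices are ours and are only meant to illustrate the relationship.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(A, B):
    n, m, r = len(A), len(B), len(B[0])
    return [[sum(A[i][v] * B[v][j] for v in range(m)) for j in range(r)]
            for i in range(n)]

def greens_function(K, terms=200):
    """Truncated sum G = k_0 + k_1 + ... + k_terms of the iterated kernels."""
    n = len(K)
    k_p = identity(n)
    G = identity(n)
    for _ in range(terms):
        k_p = matmul(k_p, K)
        G = [[G[i][j] + k_p[i][j] for j in range(n)] for i in range(n)]
    return G

# Hypothetical three-state kernel; G[u][v] estimates the expected number of
# visits to state v by a particle that starts in state u.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
G = greens_function(K)
GK = matmul(G, K)
```

Each entry of `G` should match the corresponding entry of the identity plus `GK`, mirroring Equation 16.155: a visit to w is either the starting state itself or a visit to some v followed by one more transition.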

So far we have only considered the transitions of particles from one state to 
another, as governed by the kernel k. It’s now time to use the importance function 
to improve the efficiency of this process for some particular purpose. 

Suppose that we aren’t really interested in the solution function for itself, but 
rather in some other quantity derived from the solution function. That is, we want 
to find some scalar value A defined by the expected density x(t) and some arbitrary 
a(t): 

$$ A = \int x(t)\,a(t)\,dt \tag{16.157} $$

Then as we reviewed earlier, we can normalize a and consider it as a pdf, generate 
values t from this pdf, sample x(t), and average the samples to make an estimator for 
A. This will work no matter what a happens to be, as long as it can be turned into 
a valid pdf. But things become more interesting when a incorporates some of our 
purpose in evaluating the integral; that is, a is large in regions of the domain which 
we determine to be important based on any arbitrary definition of “important” that 
we choose. 

As mentioned earlier, the “important” parts of a scene in computer graphics can 
be anything: the visible surfaces from a particular point of view, the paintings on the 
wall of an art gallery, or the surface of a light-sensitive robot moving through the 
environment. Our goal in making an image is to compute the scalar values at the 
pixels; these are just linear functions of the light distribution in the scene. All of these 
examples may be captured by Equation 16.157. By placing an importance function 
on the entire environment, and giving it a large magnitude only in the regions where 
we desire a solution, we can force samples to occur in the regions we care about and 
thereby get an accurate answer in those regions more quickly. We will now look 
more carefully at the role of importance functions in solving integral equations of the 
class $\mathcal{F}^2$. 

We call the value A of Equation 16.157 the score or the total payoff [105, 239]. 
As a particle travels through states $\{S_0, \ldots, S_n\}$, it contributes some amount $p(S_k)$ 
at each state; this is called the payoff or contribution of that state to A. The payoff 
function p is where we express our opinion of importance. If a state has a high 
importance, we assign it a large value of p. The total payoff for some particular 
particle is then given by the sum of each of these individual scores over the particle’s 
history: 

$$ \eta = \sum_{k=0}^{n} p(S_k) \tag{16.158} $$

Note that $\eta$ is not A; it’s only one estimate for A. If we trace many particles we will 
get many different values of $\eta$, but we are interested in the expected value of $\eta$; that 
is, the expected contribution of any particle is our value A from above: 

$$ A = \langle \eta \rangle \tag{16.159} $$

so A is the value of $\eta$ averaged over all possible histories. 
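
The relationship between a single score and its expected value can be illustrated with a small random-walk simulation. This is a sketch under assumptions of our own: the states are discrete, `g` is a hypothetical creation pdf over start states, `K[s][t]` stands in for k(s → t), and whatever probability is missing from a row of `K` is treated as absorption.

```python
import random

def sample_index(probs, u):
    """Return index i with probability probs[i]; return None when u falls
    beyond the total, which we read as absorption."""
    acc = 0.0
    for i, pr in enumerate(probs):
        acc += pr
        if u < acc:
            return i
    return None

def walk_score(g, K, payoff, rng):
    """Trace one particle and add up the payoff of every state it visits."""
    state = sample_index(g, rng.random())
    eta = 0.0
    while state is not None:
        eta += payoff[state]
        state = sample_index(K[state], rng.random())
    return eta

def estimate_score(g, K, payoff, walks, seed=0):
    """Average the per-particle scores to estimate A."""
    rng = random.Random(seed)
    return sum(walk_score(g, K, payoff, rng) for _ in range(walks)) / walks

# Hypothetical three-state problem with payoff only in state 2.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
g = [1.0, 0.0, 0.0]
payoff = [0.0, 0.0, 1.0]
```

Calling `estimate_score(g, K, payoff, 100000)` averages many individual scores; the result should approach the sum over states of the expected density times the payoff as the number of walks grows.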

Now consider a particle in state $S_k$. We can consider the total payoff of this 
particle over its entire life to be divisible into two segments: the payoff contributed 
before it reached state $S_k$, and the payoff at $S_k$ and all states afterward until it is 
absorbed. We call this latter contribution the remaining payoff; note that the payoff 
at state $S_k$ is included in this definition. The remaining payoff is itself a random 
variable, which we write as $\xi$: 

$$ \xi(S_k) = \sum_{p=k}^{n} p(S_p) \tag{16.160} $$

FIGURE 16.22 
The remaining payoff $\xi$ of a particle in state $S_k$; we sum the payoffs of the circled states only. 

Figure 16.22 illustrates the idea behind this definition. When the particle is finally 
absorbed, k = n, so $\xi(S_n) = p(S_n)$. 

When the particle is in a state k < n, that is, before absorption, its remaining 
payoff may be expressed as the sum of the payoff at the current state plus the payoff 
yet to come: 


$$ \xi(S_k) = p(S_k) + \sum_{p=k+1}^{n} p(S_p) = p(S_k) + \xi(S_{k+1}) \tag{16.161} $$

The expected value of this remaining payoff will be represented with a function w, 
which is variously called the potential function, the value function, or the potential 
value function: 

$$ w(S_k) = \langle \xi(S_k) \rangle \tag{16.162} $$

The value of $w(S_k)$ tells us the amount of importance to attach to state $S_k$. That is, 
it tells us if we have a particle in state $S_k$, what the remaining payoff of that particle 
is likely to be in terms of our measuring function. In other words, w(t) is the value 
that a particle in a state t will eventually contribute to A if allowed to continue; this 
payoff waiting to happen is the potential of a particle at state t. 
We can see this by expanding the formula: 

$$ w(t) = \langle \xi(t) \rangle = a(t)\,p(t) + \int k(t \to u)\,\bigl[p(t) + \langle \xi(u) \rangle\bigr]\,du \tag{16.163} $$


The first term is the payoff if the particle is absorbed at state t ; it’s the payoff at t 
times the probability of absorption. The second term is the payoff if the particle 
continues, found by computing the remaining payoff: we start with the payoff at 
state t , and add to it the payoff at every other state u to which it might go, times the 
probability that we would actually get a transfer from t to u, summed over all other 
possible states u. 

The most important thing to note here is that we are integrating over all states we 
might go to, rather than all states we might have come from , as in previous equations. 
This is expressed by finding the value at state t by use of the kernel k(t —> u), rather 
than k(s —> t). 

Recall that $\int k(t \to u)\,du = 1 - a(t)$, where a(t) is the probability of absorption 
at state t. That is, the probability of moving anywhere includes the probability of 
moving nowhere, or staying in state t without change. Rearranging this equation to 
pull together the terms on p(t), we find 

$$ \begin{aligned} w(t) &= a(t)\,p(t) + p(t) \int k(t \to u)\,du + \int k(t \to u)\,\langle \xi(u) \rangle\,du \\ &= a(t)\,p(t) + [1 - a(t)]\,p(t) + \int k(t \to u)\,w(u)\,du \\ &= p(t) + \int k(t \to u)\,w(u)\,du \end{aligned} \tag{16.164} $$

The final line here can be interpreted as saying that the importance of state t is 
equal to the immediate payoff we get by virtue of being in t, plus the remaining 
payoff from every other state u, times the probability of getting to u from t. This is 
illustrated in Figure 16.23. This is an important observation; we will return to it in 
a moment. 

We can write down the value of $w_p$ over different numbers of bounces p as we 
did for x. The importance $w_0(t)$ at t if there are no bounces remaining, as we have 
seen, is simply the immediate payoff at t: 

$$ w_0(t) = p(t) \tag{16.165} $$

The payoff after p + 1 bounces after t is similarly the p-bounce payoff at each state 
v times the probability of reaching v: 

$$ w_{p+1}(t) = \int k(t \to v)\,w_p(v)\,dv \tag{16.166} $$

FIGURE 16.23 
The importance at t is its immediate payoff, plus the payoff summed over every state u it can reach 
times the probability of reaching u from t. 

As mentioned earlier, the important difference between this equation for $w_{p+1}(t)$ 
and Equation 16.144 for $x_{p+1}(t)$ is that for importance we’re integrating the 
importance from all the future states where the particle might go, while for $x_{p+1}$ we 
integrated the probability from all the past states from which the particle might have 
come. In some sense, w(t ) tells us how much a particle at state t can still contribute 
to our result if allowed to continue; x(t) tells us how many particles are in state t to 
deliver this potential contribution. 

The payoff from a particle in state t after p bounces may be written in terms of 
the chances of getting to each state v times the immediate payoff at v: 


$$ w_p(t) = \int k_p(t \to v)\,p(v)\,dv \tag{16.167} $$

So the total payoff from state t is the payoff after all of its bounces: 

$$ w(t) = \sum_{p=0}^{\infty} w_p(t) = \int G(t \to v)\,p(v)\,dv \tag{16.168} $$

Now we return to our comment following Equation 16.164. Compare the 
interpretation of Equation 16.168 with that of Equation 16.164. Previously, we found 
the potential at t from the sum of the immediate payoff and the potential payoff 
everywhere we could go. Here the payoff comes from the chances of getting to state 
v times the immediate payoff at v. 
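
The bounce-by-bounce construction of the importance can be mirrored in a finite state space. The sketch below is an illustration under our own assumptions: `K[t][v]` is a hypothetical matrix standing in for k(t → v), the payoff vector is invented, and the infinite sum is truncated after a fixed number of bounces. Note that the kernel is applied with its first index fixed, so we sum over destinations rather than origins.

```python
def propagate_importance(w_p, K):
    """One step of Equation 16.166: w_next[t] = sum over v of K[t][v] * w_p[v]."""
    n = len(w_p)
    return [sum(K[t][v] * w_p[v] for v in range(n)) for t in range(n)]

def total_importance(payoff, K, bounces=200):
    """Accumulate w = w_0 + w_1 + ... with w_0 equal to the immediate payoff."""
    w_p = list(payoff)
    w = list(payoff)
    for _ in range(bounces):
        w_p = propagate_importance(w_p, K)
        w = [a + b for a, b in zip(w, w_p)]
    return w

# Hypothetical three-state kernel; only state 2 has an immediate payoff.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
payoff = [0.0, 0.0, 1.0]
w = total_importance(payoff, K)
```

States 0 and 1 end up with nonzero importance even though their immediate payoff is zero, because particles passing through them can still reach state 2. The accumulated `w` should also satisfy the discrete form of Equation 16.164.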

We see that w(t) is the importance function we referred to earlier. When we decide 
what domains to make “important” in a problem, we assign importances p(t). Like 
light sources, these “importance sources” distribute their importance throughout 
the environment, delivering a potential importance to each surface. In general, this 
flow of importance is completely unrelated to the flow of “stuff” (e.g., light energy), 
which is used to find x(t). The paths of particles distributing importance to w(t) and 
those distributing light to x(t) will be unrelated to each other except for their use of 
related kernels. Exercise 16.7 looks at this distinction. 

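Recalling our estimator $A = \langle \eta \rangle$, we find 

$$ \begin{aligned} A &= \Bigl\langle \sum_{k=0}^{n} p(S_k) \Bigr\rangle \\ &= \int g(t)\,w(t)\,dt \\ &= \int g(t) \int G(t \to u)\,p(u)\,du\,dt \\ &= \iint g(t)\,G(t \to u)\,p(u)\,du\,dt \end{aligned} \tag{16.169} $$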


We now have three equivalent forms for A, each based on a different expression 
developed above [105]. 

1 From the expected density x, 


$$ x(t) = g(t) + \int x(s)\,k(s \to t)\,ds \tag{16.170} $$

the expected number of particles at state t is given by the number of particles 
created at t, plus the number of particles arriving at t from s, summed over all 
other states s; this arriving count comes from the number of particles at state 
s times the probability of a transition from s to t. We can then say 

$$ A = \int x(t)\,p(t)\,dt \tag{16.171} $$

The total payoff is found by summing over each state t the expected number 
of particles at t times the immediate payoff per particle at t. 

2 From the Green’s function G, 


$$ G(u \to v) = \sum_{p=0}^{\infty} k_p(u \to v) \tag{16.172} $$

we have an expression for the likely number of times that a particle in state u 
will visit state v. From this we write 

$$ A = \iint g(u)\,G(u \to v)\,p(v)\,du\,dv \tag{16.173} $$

The total contribution is found from the number of particles made at each 
state u, times the likelihood that each particle will get to state v, times the 
immediate payoff at state v, summed over all states u and v. 

3 From the importance function w, 


$$ w(t) = p(t) + \int k(t \to u)\,w(u)\,du \tag{16.174} $$

The total potential of a particle in state t is the sum of its immediate payoff, 
plus the potential at every state u to which the particle might go times the 
probability of that transition. Then we can say 

$$ A = \int g(t)\,w(t)\,dt \tag{16.175} $$


We can find the total payoff from the number of particles created at state t 
times the potential payoff of state t. 

Recall that an inner product such as A is just the sort of thing we are looking 
for in image synthesis; it might be the color of a pixel on the screen, or the amount 
of light striking a surface in the environment. We now have three different ways of 
looking at how to compute A, based on the number of particles in each state and a 
function that tells us how much each state can contribute to A. 
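
The three forms can be compared numerically on a small discrete problem. The following sketch is ours rather than the text’s: the three-state kernel, source, and payoff are hypothetical, and each integral equation is solved by summing its bounce series for a fixed number of terms.

```python
def series_solution(source, step, terms=200):
    """Sum source + step(source) + step(step(source)) + ... term by term."""
    term = list(source)
    total = list(source)
    for _ in range(terms):
        term = step(term)
        total = [a + b for a, b in zip(total, term)]
    return total

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical three-state problem.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
g = [0.6, 0.4, 0.0]   # source density
p = [0.0, 0.2, 1.0]   # immediate payoff
n = len(g)

def forward(x):
    """Particles arriving at t from every s (kernel used as k(s -> t))."""
    return [sum(x[s] * K[s][t] for s in range(n)) for t in range(n)]

def backward(w):
    """Potential gathered at t from every destination u (kernel k(t -> u))."""
    return [sum(K[t][u] * w[u] for u in range(n)) for t in range(n)]

# Form 1: expected density x, then A = sum of x[t] * p[t].
x = series_solution(g, forward)
A_density = dot(x, p)

# Form 2: Green's function; row u is the density produced by a unit source at u.
G = [series_solution([1.0 if v == u else 0.0 for v in range(n)], forward)
     for u in range(n)]
A_green = sum(g[u] * G[u][v] * p[v] for u in range(n) for v in range(n))

# Form 3: importance w, then A = sum of g[t] * w[t].
w = series_solution(p, backward)
A_importance = dot(g, w)
```

All three values should agree to within rounding, which is the discrete version of the source-importance relationship developed in this section.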

The most important forms are the first and last in the list above. In the first 
form, particles are generated from the sources and then propagated throughout the 
environment. We find the total payoff to our measurement function by first finding 
this distribution of particles, and then weighting the number of particles at each state 
times the immediate payoff in that state. The third form finds the number of particles 
created in each state, and then finds the total payoff by weighting these particles by 
the potential payoff in that state. 

Gathering these two expressions together, 

$$ \begin{aligned} x(t) &= g(t) + \int x(s)\,k(s \to t)\,ds \\ w(t) &= p(t) + \int k(t \to u)\,w(u)\,du \end{aligned} \tag{16.176} $$

or in operator notation, 

$$ \begin{aligned} x &= g + \mathcal{K}x \\ w &= p + \mathcal{K}^{*}w \end{aligned} \tag{16.177} $$

where $\mathcal{K}^{*}$ is the adjoint of operator $\mathcal{K}$, so the two equations above are said to be 
adjoints of each other, or a pair of adjoint equations. As we saw above, these adjoint 
equations lead to the source-importance equality: 

$$ \langle x \mid p \rangle = \langle g \mid w \rangle \tag{16.178} $$

This important equality says that we can find our integral by looking at every state 
t, and either multiplying the eventual distribution at the state times the immediate 
payoff per particle, or multiplying the immediate (source) distribution times the 
eventual payoff per particle. 

This equality also tells us that if we know w, then we know x, and vice versa, 
for a given p and g. But both functions x and w come from integral equations in the 
class $\mathcal{F}^2$ with similar (adjoint) kernels, so it’s no easier to find one than the other. 

We can now find estimates for the integrals defining A by using traditional Monte 
Carlo importance sampling [239]. We start with the linear Boltzmann equation from 
Equation 16.1: 

$$ x(t) = g(t) + \int k(s \to t)\,x(s)\,ds \tag{16.179} $$

Our goal is to find a new kernel $\tilde{k}$ which will bias our random walk toward the 
“interesting” or “important” states, as specified originally by p and distributed into 
the environment as w. 

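Suppose we have an importance function I(t). For convenience, we will define 
a normalization constant $g_0$ over the sources which amplifies each source by its 
importance: 

$$ g_0 = \int I(t)\,g(t)\,dt \tag{16.180} $$

Now we will define a new function $\tilde{x}(t)$ that will be the new, importance-weighted 
pdf we will sample, but that still has the same integral. That is, 

$$ \int k(s \to t)\,x(s)\,ds = \int \frac{k(s \to t)\,x(s)}{\tilde{x}(s)}\,\tilde{x}(s)\,ds \tag{16.181} $$

We build this new function by multiplying the basic integral equation through by 
$I(t)/g_0$: 

$$ \begin{aligned} \tilde{x}(t) = \frac{I(t)\,x(t)}{g_0} &= \frac{g(t)\,I(t)}{g_0} + \frac{I(t)}{g_0} \int k(s \to t)\,x(s)\,ds \\ &= \frac{g(t)\,I(t)}{g_0} + \int \frac{k(s \to t)\,I(t)}{I(s)}\,\frac{I(s)\,x(s)}{g_0}\,ds \\ &= \tilde{g}(t) + \int \tilde{k}(s \to t)\,\tilde{x}(s)\,ds \end{aligned} \tag{16.182} $$

where $\tilde{g}(t) = g(t)\,I(t)/g_0$ and $\tilde{k}(s \to t) = k(s \to t)\,I(t)/I(s)$. 
By construction, $\int \tilde{g}(t)\,dt = 1$. So our desired value A may then be found from 

$$ \begin{aligned} A &= \int p(t)\,x(t)\,dt \\ &= \int p(t)\,\frac{g_0}{I(t)}\,\tilde{x}(t)\,dt \\ &= g_0 \int \frac{p(t)}{I(t)}\,\tilde{x}(t)\,dt \end{aligned} \tag{16.183} $$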

This means that whenever $p(t) \neq 0$, we require $I(t) \neq 0$; that is, we are simply 
enforcing the reasonable constraint that our importance function I must be nonzero 
anywhere we have asserted that the immediate importance p is nonzero. Now we find 
A by drawing values of t from $\tilde{x}(t)$, which is a density weighted toward sampling the 
“important” states more often. We compensate for this distortion in the sampling 
process with the factor $g_0/I(t)$. But so far we haven’t discussed how to find (or 
create) a good function I(t). 
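
The change of variables behind the importance-weighted equation can be checked on a small discrete problem. This sketch rests on assumptions of our own: the kernel, source, payoff, and importance values are hypothetical, the weighted source and kernel are taken to be g(t)I(t)/g_0 and k(s → t)I(t)/I(s), and both equations are solved by summing their bounce series.

```python
def solve_density(source, K, terms=400):
    """Sum the bounce series for x = source + (x pushed through K)."""
    n = len(source)
    term = list(source)
    total = list(source)
    for _ in range(terms):
        term = [sum(term[s] * K[s][t] for s in range(n)) for t in range(n)]
        total = [a + b for a, b in zip(total, term)]
    return total

# Hypothetical three-state problem.
K = [[0.0, 0.5, 0.2],
     [0.3, 0.0, 0.3],
     [0.1, 0.4, 0.0]]
g = [0.6, 0.4, 0.0]
p = [0.0, 0.2, 1.0]
I = [0.5, 1.0, 2.0]   # assumed importance; nonzero wherever p is nonzero
n = len(g)

g0 = sum(I[t] * g[t] for t in range(n))
g_tilde = [g[t] * I[t] / g0 for t in range(n)]
K_tilde = [[K[s][t] * I[t] / I[s] for t in range(n)] for s in range(n)]

x = solve_density(g, K)
x_tilde = solve_density(g_tilde, K_tilde)

A_direct = sum(p[t] * x[t] for t in range(n))
A_weighted = g0 * sum(p[t] / I[t] * x_tilde[t] for t in range(n))
```

The two values of A should agree, and `x_tilde[t]` should equal `I[t] * x[t] / g0` at every state. The rows of `K_tilde` need not sum to less than one, so a practical sampler would still have to normalize them and carry the leftover factor as a weight; that step is not shown here.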

A good I(t) would be big at those states where the contribution to the final 
integral is big, so it reflects the amount by which particles at state t can contribute 
to A. That is exactly the potential importance function w(t) derived above. That 
is, it’s the given importance p(t) propagated into the environment, identifying those 
states that will affect the integral. From above, 

$$ w(t) = p(t) + \int k(t \to u)\,w(u)\,du \tag{16.184} $$

We can derive the adjoint identity from above by multiplying through this importance 
equation by the expected density x(t), and multiplying through the transport 
equation by the potential importance w(t), and integrating both equations over all 
states t: 

$$ \begin{aligned} \int w(t)\,x(t)\,dt &= \int p(t)\,x(t)\,dt + \iint x(t)\,k(t \to u)\,w(u)\,dt\,du \\ \int x(t)\,w(t)\,dt &= \int g(t)\,w(t)\,dt + \iint w(t)\,k(s \to t)\,x(s)\,ds\,dt \end{aligned} \tag{16.185} $$


The two double integrals on the right are equal, since we are free to rename the 
dummy variables in the integration. The two left-hand sides are also clearly the 
same, so the first terms on the right are equal, giving us the source-importance identity 
Equation 16.178. 

Now that we have developed some intuition behind the importance function 
and how it gets used, we can boil down the entire discussion into just a few basic 
equations, originally presented to the graphics community by Smits et al. [414], 
Pattanaik [332], and Pattanaik and Mudur [334]. The main advantage of this step 
is that it allows us to conveniently summarize our observations, and it makes further 
manipulations much easier. 

We summarize our adjoint equations as 


$$ \begin{aligned} x &= g + \mathcal{K}x \\ w &= p + \mathcal{K}^{*}w \end{aligned} \tag{16.186} $$


We recall from earlier in this chapter that when the operator $\mathcal{K}$ is finite, it may 
be represented as a matrix K. In computer graphics this will be a matrix of real 
numbers. Then the adjoint of K is its transpose; that is, $K^{*} = K^{t}$. In this case, we 
can show that $(I - K)^{t} = I - K^{t}$ (see Exercise 16.6). So for a matrix $L = I - K$, 
$L^{*} = L^{t}$. 

We can rewrite Equation 16.186 in finite dimensions in terms of the matrix L, and 
column vectors for each of the source and distributed quantities in Equation 16.186: 

$$ \begin{aligned} Lx &= g \\ L^{t}w &= p \end{aligned} \tag{16.187} $$


Equations 16.187 relate the given importance p to the distributed importance w, 
and the given source function g to the distributed information x. These quantities 
are related through two different uses of the same kernel. 

Before leaving this material, we will take a look at the error in A that comes about 
from using the finite approximate matrix operator L rather than the exact operator 
$\mathcal{L}$ [413]. We start with the scalar $A = p^{t}x$ (recall that both p and x are column 
vectors, so we need to transpose one of them to form an inner product): 

$$ \begin{aligned} A(x) &= p^{t}x \\ &= (L^{t}w)^{t}x \\ &= w^{t}Lx \\ &= w^{t}g \end{aligned} \tag{16.188} $$

When we replace $\mathcal{L}$ with an approximation L, such as one of the finite-space 
approximations discussed in this chapter, we can write the approximation as the 
sum of the exact operator plus an error term: $L = \mathcal{L} + \Delta L$. We might then have an 
exact solution $\tilde{x}$ to the approximate integral equation: 

$$ L\tilde{x} = g \tag{16.189} $$

or, expanding the approximate operator, 

$$ \mathcal{L}\tilde{x} = g - \Delta L\,\tilde{x} \tag{16.190} $$

Now, to see what happens to the integral, we find the effect of this change in $\mathcal{L}$ on 
the value of A. Since A is linear, 

$$ \begin{aligned} A(x - \tilde{x}) &= A(x) - A(\tilde{x}) \\ &= p^{t}x - p^{t}\tilde{x} \\ &= w^{t}g - w^{t}\mathcal{L}\tilde{x} \\ &= w^{t}g - w^{t}(g - \Delta L\,\tilde{x}) \\ &= w^{t}\Delta L\,\tilde{x} \end{aligned} \tag{16.191} $$

So $w^{t}\Delta L\,\tilde{x}$ is the error due to using an approximate $\mathcal{L}$ and x. If we knew w, we 
would know this error, but determining w means solving an integral equation of the 
same form required to find x; if we knew one, we would know the other. A useful 
practical technique is to estimate w and compute the error $w^{t}\Delta L\,\tilde{x}$. 
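
The error expression can be verified on a small matrix example. In this sketch, which is only an illustration, the exact operator is itself modelled as a hypothetical 3 by 3 matrix, the perturbation `dL` is invented, and the linear systems are solved with a plain Gaussian elimination routine written for the purpose.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for j in range(c, n + 1):
                M[r][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / M[i][i]
    return x

def transpose(A):
    return [list(col) for col in zip(*A)]

def dot(a, b):
    return sum(ai * bi for ai, bi in zip(a, b))

# Hypothetical exact operator, perturbation, source, and payoff.
L_exact = [[1.0, -0.5, -0.2],
           [-0.3, 1.0, -0.3],
           [-0.1, -0.4, 1.0]]
dL = [[0.01, -0.02, 0.0],
      [0.0, 0.01, 0.01],
      [-0.01, 0.0, 0.02]]
L_approx = [[L_exact[i][j] + dL[i][j] for j in range(3)] for i in range(3)]
g = [0.6, 0.4, 0.0]
p = [0.0, 0.2, 1.0]

x = solve(L_exact, g)                 # exact solution
x_tilde = solve(L_approx, g)          # solution of the approximate system
w = solve(transpose(L_exact), p)      # importance from the adjoint system

actual_error = dot(p, x) - dot(p, x_tilde)
predicted_error = sum(w[i] * dL[i][j] * x_tilde[j]
                      for i in range(3) for j in range(3))
```

The two error values should match to within rounding. In practice w is not known exactly, so the prediction would be computed from an estimate of w, as the text suggests.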

16.10 Singularities 

When the kernel (or free term) in an integral equation is smooth and continuous, 
the techniques described earlier in this chapter are appropriate. But when these 
conditions fail, the algorithms may converge slowly or fail altogether. We say that 
an integral equation is singular if one of the following conditions holds [120]: 

■ The domain of the integral is infinite or semi-infinite. 

■ The kernel has an infinite or nonexisting derivative. 

■ The kernel is discontinuous. 

■ The kernel or free term has discontinuous derivatives. 

These conditions are known as singularities. In computer graphics we are mostly 
concerned with singularities of the third and fourth types. For example, consider 
the integral equation that determines the incident light arriving at some point on a 
surface. If there is a bright light source blocked by an object with a sharp edge, the 
illumination function will have a discontinuity at that edge, where the incident light 
goes from bright to dark. When an integral is not singular, we say it is regular. 

It is extremely important to handle singularities if we want to get accurate 
results efficiently. Unfortunately, there is no general method that handles all types 
of singularities in all situations; generally we must design a technique to suit the 
particular problem. This is the case in rendering, where singularities have received 
some focused attention. We will discuss some of these specialized methods in 
Chapter 17. For now, we will give a brief overview of the general ideas that lie behind 
most singularity-handling algorithms. Our discussion is mostly based on Delves and 
Mohamed [120] and Kondo [251]. 

Singularities can be distinguished into two types: benign and malignant. Sometimes 
a singularity in the free term can exactly cancel out a singularity in the kernel, 
resulting in a smooth solution function. Such cooperative singularities are benign. 
On the other hand, singularities that induce singular solution functions are called 
malignant , since they propagate into our results. In the following discussion we will 
only address singularities in the kernel. 

There is no single best way to handle all forms of singularities. A review of the 
existing tools suggests that there are six general approaches: 

Ignorance: Simply pretend that the singularity doesn’t exist. We might never even 
check for the presence of singularities in the first place. 

Removal: Change the kernel so that the singularity no longer exists, typically by 
subtracting a function that captures the singularity, and then include an error 
term to correct for the change. 

Factorization: Rewrite the kernel as the product of two terms, only one of which 
is singular. If we’re lucky, the singular kernel has a simpler form than the 
original and is amenable to repair by the other methods on this list. 

Avoidance: Change the domain of integration to side-step the singularity. 

Divide and conquer: Break up the domain of the integral into several smaller 
domains. Leave a gap (perhaps of zero measure) between domains where the 
singularity lurks. 

Coexistence: Accept the singularity in the kernel, and rather than try to change the 
problem, change the quadrature rule so that the singularity doesn’t cause the 
integration error to explode. 

These approaches do not all have sharp boundaries; for example, divide and conquer 
has much in common with coexistence. 

To give the flavor of these methods, we will discuss four of them— removal , 
factorization , divide and conquer , and coexistence —below. We will see some of 
these methods applied to the radiance equation in Chapter 17. 



16.10.1 Removal 

Suppose the kernel is singular for all values of u for a given value of t; for example, 
$k(t, u) = 1/(u - q)$ for some value of q. More generally, we can write such a kernel 
as 

$$ k(t, u) = \frac{k_0(t, u)}{(u - q)^m} \tag{16.192} $$

for some nonsingular kernel $k_0$. We say that such a kernel has a pole of order m at 
u = q. We can rewrite the kernel to reduce the effect of this singularity by subtracting 
a term; this operation on the singularity is abbreviated in the list above as removal. 

We add and subtract a term $k_0(t, q)\,x(q)$ in the numerator of the integrand and then 
rearrange the terms as follows: 

$$ \begin{aligned} \int_a^b k(t, u)\,x(u)\,du &= \int_a^b \frac{k_0(t, u)\,x(u) - k_0(t, q)\,x(q) + k_0(t, q)\,x(q)}{(u - q)^m}\,du \\ &= \int_a^b \frac{k_0(t, u)\,x(u) - k_0(t, q)\,x(q)}{(u - q)^m}\,du + k_0(t, q)\,x(q) \int_a^b \frac{du}{(u - q)^m} \end{aligned} \tag{16.193} $$


If m < 1, then the first term is regular at u = q, because the numerator goes to zero 
faster than the denominator. The second term can be integrated exactly: 

$$ \int \frac{du}{(u - q)^m} = \frac{q - u}{(m - 1)(u - q)^m} \tag{16.194} $$


If m > 1, then neither integral exists, and we have to resort to other techniques to 
handle the singularity. 

Another common form of singularity is along the line t = u. We can weaken this 
type of singularity with a more direct form of subtraction, by simply removing x(t) 
at u = t: 

$$ \begin{aligned} \int_a^b k(t, u)\,x(u)\,du &= \int_a^b k(t, u)\,[x(u) - x(t) + x(t)]\,du \\ &= \int_a^b k(t, u)\,[x(u) - x(t)]\,du + \int_a^b k(t, u)\,x(t)\,du \\ &= \int_a^b k(t, u)\,[x(u) - x(t)]\,du + x(t)\,r(t) \end{aligned} \tag{16.195} $$

where 

$$ r(t) = \int_a^b k(t, u)\,du \tag{16.196} $$

is assumed to be known or can be found. 
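
The benefit of this subtraction can be seen in a small numerical experiment. The sketch below uses choices of our own that are not taken from the text: a weakly singular kernel $1/\sqrt{|t - u|}$ on [0, 1], the solution function x(u) = u, a simple midpoint rule, and closed forms for r(t) and for the exact integral that follow from elementary integration of this particular kernel.

```python
import math

def midpoint_nodes(a, b, n):
    """Cell width and midpoints of n equal cells on [a, b]."""
    h = (b - a) / n
    return h, [a + (i + 0.5) * h for i in range(n)]

def naive_integral(k, x, t, a, b, n):
    """Midpoint rule applied directly to k(t, u) x(u)."""
    h, nodes = midpoint_nodes(a, b, n)
    return h * sum(k(t, u) * x(u) for u in nodes)

def subtracted_integral(k, x, r, t, a, b, n):
    """Midpoint rule on k(t, u) [x(u) - x(t)], plus the known term x(t) r(t)."""
    h, nodes = midpoint_nodes(a, b, n)
    smooth = h * sum(k(t, u) * (x(u) - x(t)) for u in nodes)
    return smooth + x(t) * r(t)

# Assumed example: kernel singular along t = u, with r(t) known in closed form.
def k(t, u):
    return 1.0 / math.sqrt(abs(t - u))

def r(t):
    return 2.0 * math.sqrt(t) + 2.0 * math.sqrt(1.0 - t)

def x(u):
    return u

def exact(t):
    """Closed form of the integral of u / sqrt(|t - u|) over [0, 1]."""
    return ((4.0 / 3.0) * t ** 1.5 + 2.0 * t * math.sqrt(1.0 - t)
            + (2.0 / 3.0) * (1.0 - t) ** 1.5)

t = 0.3
err_naive = abs(naive_integral(k, x, t, 0.0, 1.0, 1000) - exact(t))
err_subtracted = abs(subtracted_integral(k, x, r, t, 0.0, 1.0, 1000) - exact(t))
```

With the same number of samples, `err_subtracted` should come out much smaller than `err_naive`, because the subtracted integrand stays bounded at u = t. The sample count is chosen so that no midpoint lands exactly on the singularity.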

In general, this type of approach works with the function that generalizes Equation 16.192 
to the form 

$$\frac{b(t)}{\prod_{i=1}^{n}(t-q_i)^{m_i}} = \frac{b(t)}{c(t)} \tag{16.197}$$

which has n poles, each one located at a $q_i$ with order $m_i$. We want to find a function 
x(t) = b(t)c(t) to knock out the singularities. 

It turns out that for many functions this weakening of the singularity can be done 
in a straightforward way, but it’s algebraically messy and not very illuminating; the 
details may be found in Delves and Mohamed [120]. 


16.10.2 Factorization 

Sometimes a complicated, singular kernel can be considered the product of two 
simpler kernels, one regular and one singular. The simpler, singular kernel may be 
easier to work with. On the list we call this approach to a singularity factorization. 
The goal is to factorize the original kernel k(t, u) as the product of two kernels, 


$$k(t, u) = p(t, u)\,k_0(t, u) \tag{16.198}$$


where p(t, u) is an ill-behaved function that contains all the singularities in the 
original kernel, while $k_0(t, u)$ is well-behaved, or regular. We will treat p(t, u) exactly, 
and approximate $k_0(t, u)$ as usual, say with a quadrature rule. A drawback of this 
approach is that we need to be able to find some information about the function p, 
say its moments with respect to the monomials: 


$m_{ij} =$ 



(16.199) 


If we can find these moments, then we can build a set of rules based on the method 
of undetermined coefficients [120]. 
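As an illustration of building a rule from moments, here is a small sketch of the method of undetermined coefficients for a singular factor. Several things are assumptions made for the example: the singular factor is taken to be $p(u) = u^{-1/2}$ on [0, 1] with no dependence on t, its moments are taken to be $m_j = \int_0^1 p(u)\,u^j\,du = 1/(j + 1/2)$, the three nodes are arbitrary, and the helper names are hypothetical. The weights are chosen so that the rule integrates p times each monomial exactly.

```python
import math

def solve_linear(A, y):
    """Solve A w = y by Gaussian elimination with partial pivoting."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (M[i][n] - s) / M[i][i]
    return w

def product_rule_weights(nodes, moments):
    """Row j enforces sum_i w_i * nodes[i]**j == moments[j]."""
    A = [[u ** j for u in nodes] for j in range(len(nodes))]
    return solve_linear(A, moments)

nodes = [0.0, 0.5, 1.0]
moments = [1.0 / (j + 0.5) for j in range(len(nodes))]  # integral of u**(j - 0.5) on [0, 1]
weights = product_rule_weights(nodes, moments)

# Apply the rule to the regular factor only; p is already folded into the weights.
estimate = sum(w * math.cos(u) for w, u in zip(weights, nodes))
```

The singular factor never gets evaluated: only the smooth function (here the cosine) is sampled, which is the point of the factorization.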

In extreme cases, Equation 16.198 may be extended to represent the sum of many 
such products: 

$$k(t, u) = \sum_{i=1}^{N} p_i(t, u)\,k_{0,i}(t, u) \tag{16.200}$$

where, thanks to linearity, we can break up the total integral operator K into a sum 
of smaller integrals, and attempt to handle each $p_i$ individually. 


16.10.3 Divide and Conquer 

If the singularity can be handled by splitting up the kernel, then because the kernel 
operator is linear, we can write the original operation as the sum of two smaller 
integrals. For example, if the singularity is tightly localized within a disk of radius ε 
around a value u = q, then we can write 

$$\int_a^b k(t,u)\,x(u)\,du \approx \int_a^{q-\epsilon} k(t,u)\,x(u)\,du + \int_{q+\epsilon}^{b} k(t,u)\,x(u)\,du \tag{16.201}$$

This approach is listed above as divide and conquer . Note that the nature of the 
kernel over the two domains may be quite different; for example, it may be smooth 
over the first domain but complicated over the second. We may therefore choose to 
use different quadrature rules in the two domains. 
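The split in Equation 16.201 can be sketched as follows. This is only an illustration under stated assumptions: the integrand $1/\sqrt{|u-q|}$, the values of q and ε, the sample counts, and the function names are all hypothetical. Each subdomain gets its own midpoint rule with its own resolution, and the small interval around q is simply skipped, which is why the equation is an approximation.

```python
import math

def midpoint(f, lo, hi, n):
    """Composite midpoint rule with n panels on [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def split_integral(f, a, b, q, eps, n_left, n_right):
    """Eq. 16.201: integrate over [a, q-eps] and [q+eps, b], skipping the disk."""
    left = midpoint(f, a, q - eps, n_left)
    right = midpoint(f, q + eps, b, n_right)
    return left + right

q, eps = 0.3, 1e-3
f = lambda u: 1.0 / math.sqrt(abs(u - q))

approx = split_integral(f, 0.0, 1.0, q, eps, n_left=4000, n_right=8000)

# Closed forms for this particular integrand, used only to judge the result.
exact_outside = 2 * (math.sqrt(q) - math.sqrt(eps)) + 2 * (math.sqrt(1 - q) - math.sqrt(eps))
exact_full = 2 * math.sqrt(q) + 2 * math.sqrt(1 - q)
omitted = exact_full - approx        # what the skipped disk would have contributed
```

For this integrand the omitted piece is close to $4\sqrt{\epsilon}$, so shrinking ε trades a smaller omission for a harder integrand near the cut.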


16.10.4 Coexistence 

Rather than try to change the kernel or the original integral equation, we accept it 
as written and recognize that we will eventually need to use a quadrature rule to 
evaluate it. This represents another avenue of attack on the problem, since we can 
tune the rule to the singularity. We might say that by changing the quadrature rule 
in response to the kernel, we are coexisting with the singularity rather than trying to 
remove it. 

This approach has been used with some success in computer graphics. In particular, 
the calculation of form factors in the radiosity method is sensitive to 
singularities when two polygons abut. To solve this problem in Galerkin methods, a 
modified inner product has been used during the quadrature step [503], and a form 
of divide and conquer has been developed for general bases by adaptively tiling the 
domain [383]. 


16.11 Further Reading 

Excellent modern surveys of numerical methods for integral equations are offered 
in the books by Delves and Mohamed [120] and Porter and Stirling [343]; these 
books are the primary references for this chapter. A good concise summary article 
is Golberg’s discussion in [162]. An extensive literature survey to 1976 has been 
compiled by Atkinson [22]; a survey directed to computer graphics was written in 
1993 by Arvo [14]. 

Good general references to integral equations may be found in the above books. 
More theoretical developments may be found in books by Kondo [251], Cochran 
[93], and Kanwal [240]. Many examples of integrals evaluated using a plethora 
of different methods are given by Baker [26]. Issues of convergence are discussed 
by Sloan [413]. Error analysis is covered by Baker [26]; a rigorous discussion may 
be found in the book by Kress [254]. Multigrid methods for solution of integral 
equations are discussed in a thesis by Schippers [379]; an error analysis appears 
there and in the article by Hackbusch [176]. Singular integral equations form a 
subfield of their own. The classic reference on singular integral equations is the 
book by Muskhelishvili [315]. Another useful book in this area is by Mikhlin [303]. 
Descriptions of linear integral equation solution systems, complete with code and 
error analysis, are given by Atkinson [22] (also available in [21]), and Schippers 
[379], as well as many articles in the high-energy physics journals directed to solving 
nuclear transport problems. 

The literature on integral equations is enormous and expanding. Two comprehensive 
bibliographies that capture the state of the literature to 1971 are Voigt’s 
[456] and Noble’s [324]; I know of no comparable bibliography on the subject since. 


16.12 Exercises 


Exercise 16.1 

Find the iterated kernel of order 2 for the kernel k(t, u) = tu. 

Exercise 16.2 

Use the Fubini theorem to show that if k is independent of u, 

$$\int_{u=0}^{t}\int_{v=0}^{u} k(v)\,dv\,du = \int_{v=0}^{t} (t-v)\,k(v)\,dv \tag{16.202}$$


Exercise 16.3 

Use a symbolic math program to help you compute the Newton-Cotes rule for N = 3. 
Use the interval [a, a + 2 h] and quadrature points {a, a + h, a + 2 h}. Does this rule 
have a common name? 

Exercise 16.4 

The Tchebyshev polynomial of degree n is written $T_n(t)$ and is defined by 

$$T_n(t) = \cos(n \cos^{-1} t) \tag{16.203}$$

This may be expanded and simplified with trig identities to get the recurrence formula 

$$\begin{aligned}
T_{n+1}(t) &= 2t\,T_n(t) - T_{n-1}(t), \qquad n \ge 1 \\
T_0(t) &= 1 \\
T_1(t) &= t
\end{aligned} \tag{16.204}$$

(a) Write out expressions for $T_1(t)$ through $T_6(t)$. 

(b) Use the method of undetermined coefficients to find the equivalent to the 
trapezoid rule (N = 2) using $\{T_n(t)\}$ as a basis. How do the rules compare? 

(c) Use the method of undetermined coefficients to find the equivalent to the 
N = 3 rule you found in Exercise 16.3 for the monomial basis, but use 
$\{T_n(t)\}$ as a basis. How do the rules compare? 

Exercise 16.5 

Derive Equation 16.152. 

Exercise 16.6 

Show that $(I - K)^T = I - K^T$ for a real matrix K. 

Exercise 16.7 

(a) Suppose we distribute importance exactly along with the driving function; 
that is, p(t) = g(t). Describe how importance will flow through such a scene 
with respect to the particles used to find x(t). 

(b) Suppose we have a complete solution x(t) and set p(t) = x(t); describe the 
flow of importance in this situation. 



An object known as a camera... It pretends to 
be an ally, but it’s not. It’s a beckoning come-on 
for a quick walk around the block—in the 
Twilight Zone. 

Rod Serling 

(“The Twilight Zone”: “A Most Unusual Camera,” 1960) 



THE RADIANCE EQUATION 


17.1 Introduction 

In this chapter we combine the transport equation with radiometry to arrive at 
the radiance equation. This is the central equation of image synthesis, because 
it completely captures the distribution of light in a scene (limited to our built-in 
restriction to geometric optics). The problem is that the radiance distribution L is 
described implicitly , so we know what conditions it must satisfy, but we don’t know 
what it actually is. 

The process of shading in Chapter 15 was based on our being able to find a 
complete and explicit description of the light falling on a point. The heart of the 
image synthesis algorithms in this book is to use the radiance equation to find such 
a description. 


17.2 Forming the Radiance Equation 


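Recall from Chapter 12 the integral form of the transport equation expressed in 
terms of flux, which we expand here so we can see all the terms: 

$$\Phi(\mathbf{r},\vec\omega) = \mu(\mathbf{r},\mathbf{s})\left[\epsilon(\mathbf{s},\vec\omega) + \int k(\mathbf{s},\vec\omega'\to\vec\omega)\,\Phi(\mathbf{s},\vec\omega')\,d\vec\omega'\right] + \int_0^{\|\mathbf{r}-\mathbf{s}\|} \mu(\mathbf{r},\mathbf{a})\left[\epsilon(\mathbf{a},\vec\omega) + \int k(\mathbf{a},\vec\omega'\to\vec\omega)\,\Phi(\mathbf{a},\vec\omega')\,d\vec\omega'\right] d\alpha \tag{17.1}$$

where here and for the rest of this chapter we write $\mathbf{a} = \mathbf{r} - \alpha\vec\omega$. 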

To find the radiance equation, we follow Arvo [15] and begin by expressing this 
function in terms of radiance rather than flux. Comparing the definitions of radiance 
and flux from Equations 13.12 and 13.7, we can see that radiance is just the flux Φ 
times the energy E of the photons being carried: 


$$L = E\,\Phi \tag{17.2}$$

Each of the terms in Equation 17.1 may be expressed in terms appropriate for 
radiance by simply multiplying by the energy of the photons involved. 

So the big picture is that to find the radiance at a point r coming from a direction 
$\vec\omega$, we find the nearest surface point $\mathbf{s} = h(\mathbf{r}, \vec\omega)$, compute its outgoing radiance into 
$\vec\omega$, and accumulate all the radiance due to volume emission or inscattering along the 
way from s. Schematically, we can write this as 

$$L(\mathbf{r},\vec\omega) = L(\mathbf{s}\to\mathbf{r}) + \int L(\mathbf{a}\to\mathbf{r})\,d\alpha \tag{17.3}$$

where s is the nearest point on a surface in direction $\vec\omega$ as seen from r, and a is swept 
by α to generate all the points along the line from s to r. 

We will now make three changes to this basic equation to promote it to the full- 
fledged radiance equation: we will replace the scattering function k with the BDF, 
and add phosphorescence and fluorescence . 


17.2.1 BDF 

The first change we will make to Equation 17.1 is to re-express the scattering function 
k in terms of the bidirectional functions of Chapter 13. Recall that those functions 
express outgoing radiance in terms of incident irradiance, so each right-hand term 
$L(\mathbf{p}, \vec\omega)$ gets replaced by $L(\mathbf{p}, \vec\omega)\cos\theta$, where θ is the angle between $\vec\omega$ and the normal 
to the surface at p. When p is a point in space, the “normal” is the direction in 
which we’re integrating. 

We also use the bidirectional surface-scattering distribution function (the BSSDF, 
or simply BDF) $f$, which is a combination of the bidirectional reflection distribution 
function (or BRDF) $f_r$, and the bidirectional transmission distribution function (or 
BTDF) $f_t$. This means that we now integrate over all incident directions $\Theta_i$ rather 
than just the hemisphere $\Omega_i$ required for reflection. 

We will continue to use the scattering term k until Section 17.2.4 when we 
assemble the radiance-based flux equation. 

To satisfy BDFs that depend on polarization, we will also include an ellipsometric 
vector e in the description of the radiance $L(\mathbf{r}, \vec\omega)$. 


17.2.2 Phosphorescence 

Recall from Chapter 14 that phosphorescence is a phenomenon whereby a material 
traps incident energy for longer than about $10^{-8}$ seconds before re-emitting it as 
visible light. Generally this re-emission has no directional character but is radiated 
uniformly in all directions; that is, it appears as perfect diffuse emission. 

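To model phosphorescence we will break down the emission term in Equation 17.1 
into a blackbody (or thermal or incandescent) term $\epsilon_b$ and a phosphorescent 
term $\epsilon_p$: 

$$\epsilon(\mathbf{p},\vec\omega,t,\lambda) = \epsilon_b(\mathbf{p},\vec\omega,t,\lambda) + \epsilon_p(\mathbf{p},\vec\omega,t,\lambda) \tag{17.4}$$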

The incandescent term comes from Equation 14.49, which gives the light from 
a blackbody in terms of its temperature T and the surrounding index of refraction 
$\eta(\nu)$. We will write the temperature as a function of position p and time t. Normally, 
incandescent radiation has no preferred direction; it is isotropic. But it is very 
convenient in computer graphics to associate directional characteristics with light 
sources; these can be due to low-level geometric and physical properties that we 
don’t want to explicitly include in our models. The easiest way to include these terms 
is to introduce a modulating function $m_b(\vec\omega)$ into the expression for the blackbody 
emission, giving us 


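$$\epsilon_b(\mathbf{p},\vec\omega,t,\lambda) = m_b(\vec\omega)\,\frac{2h\nu^3\,\eta^2(\nu)}{c^2}\,\frac{1}{\exp[h\nu / kT(\mathbf{p},t)] - 1} \tag{17.5}$$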

(recall $\nu = c/\lambda$). For convenience, rather than work with the energy given by $\epsilon_b$, we 
will work with its related radiance $L^e$. 
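A directionally modulated blackbody emitter can be sketched in a few lines. This is an illustration only: it uses the standard Planck spectral radiance per unit frequency, which may differ in units or normalization from Equation 14.49, and the cosine-power lobe used for the modulating function $m_b$ is a hypothetical choice, as are all the function names.

```python
import math

H = 6.62607015e-34    # Planck constant, J s
C = 2.99792458e8      # speed of light in vacuum, m/s
K_B = 1.380649e-23    # Boltzmann constant, J/K

def planck_radiance(wavelength, temperature, eta=1.0):
    """Blackbody spectral radiance per unit frequency, W / (m^2 sr Hz).

    wavelength is the vacuum wavelength in meters, so nu = c / wavelength;
    eta is the index of refraction of the surrounding medium.
    """
    nu = C / wavelength
    return (2.0 * H * nu**3 * eta**2 / C**2) / math.expm1(H * nu / (K_B * temperature))

def spotlight_modulation(cos_angle, exponent=8.0):
    """Hypothetical m_b: a cosine-power lobe about the light's axis."""
    return max(cos_angle, 0.0) ** exponent

def emitted(wavelength, temperature, cos_angle):
    """Modulated blackbody emission in a direction making cos_angle with the axis."""
    return spotlight_modulation(cos_angle) * planck_radiance(wavelength, temperature)

on_axis = emitted(550e-9, 3000.0, 1.0)
off_axis = emitted(550e-9, 3000.0, 0.5)
```

The temperature is passed as a plain number here; in the text it is a function of position and time, so a renderer would look it up before making this call.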

The phosphorescent term may be derived by simply modeling the behavior of 
phosphorescent materials. At any given moment, the energy absorbed at a point p at 
wavelength λ is determined by the energy arriving from every direction $\vec\omega \in \Theta_i$ at λ, 
times a phosphorescence efficiency function $P_p(\lambda)$. This energy decays over time as 
it is radiated according to a decay function d(t). So the radiance at a given moment 
is the result of all the energy absorbed in the past, weighted by how long it has been since that 
energy was absorbed. We model the absorption term by integrating the irradiance 
over all directions (giving us the total energy absorbed at that wavelength), and then 
scaling the result by the efficiency of the material at that wavelength. The total 
phosphorescent emission at a particular time is given by an integral over all past time, 
which weights the absorption at a given time by the decay function since then. As with 
the incandescent term, we can add a direction-dependent modulation function $m_p$ 
into the definition to account for fine surface geometry. In symbols, 

$$\epsilon_p(\mathbf{p},\vec\omega,t,\lambda) = m_p(\vec\omega)\int_{-\infty}^{t} d(t-\tau)\,P_p(\mathbf{p},\lambda)\int_{\Theta_i} L(\mathbf{p},\vec\omega',\lambda,\tau)\cos\theta'\,d\vec\omega'\,d\tau \tag{17.6}$$

where θ' is the angle made by $\vec\omega'$ with respect to the normal at p. 

Equation 17.6 is missing a saturation component. After a certain point, the 
material cannot store any more energy at a given wavelength, and the excess is 
converted to heat or is not absorbed at all. So the scaling factor $P_p$ should actually 
depend on how much room is left for storing energy at a given wavelength. This 
would make the expression much more complex; our approximation is designed for 
low illuminations. 
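The time integral in Equation 17.6 becomes a weighted sum once time is sampled. The sketch below works under several assumptions: the directional integral has already been collapsed to one irradiance value per time step, the material is far from saturation, and the decay function is a simple exponential picked only for illustration (it is not the Leverenz model). The function name and parameters are hypothetical.

```python
import math

def phosphorescent_emission(irradiance_history, dt, efficiency, decay, modulation=1.0):
    """Discrete version of Eq. 17.6, evaluated at the time of the last sample.

    irradiance_history[j] is the cosine-weighted incident radiance, already
    integrated over all directions, at time j*dt (oldest sample first).
    """
    n = len(irradiance_history)
    total = 0.0
    for j, absorbed in enumerate(irradiance_history):
        elapsed = (n - 1 - j) * dt               # t - tau
        total += decay(elapsed) * efficiency * absorbed * dt
    return modulation * total

# Hypothetical exponential decay with a 0.5 second time constant.
decay = lambda elapsed: math.exp(-elapsed / 0.5)

dt = 0.01
lit_then_dark = [1.0] * 100 + [0.0] * 100        # light on for 1 s, then off for 1 s
glow = phosphorescent_emission(lit_then_dark, dt, efficiency=0.2, decay=decay)

# A decay that is nonzero only at zero elapsed time models a material
# with no phosphorescence: nothing is left once the light is off.
no_memory = lambda elapsed: 1.0 if elapsed == 0.0 else 0.0
dark = phosphorescent_emission(lit_then_dark, dt, efficiency=0.2, decay=no_memory)
```

Here `glow` is positive even though the last second of the history is dark, which is the afterglow the integral is meant to capture.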

Recall that a good candidate for d is the model in Equation 14.58 proposed by 
Leverenz [267]: 

L(t,L 0 ) = -—± - -2 (17.7) 

A material with no phosphorescence may be modeled with a discrete delta function 
$d(\tau) = \delta[t - \tau]$; that is, 

$$d(\tau) = \begin{cases} 1 & t = \tau \\ 0 & \text{otherwise} \end{cases} \tag{17.8}$$


17.2.3 Fluorescence 

Recall from Chapter 14 that fluorescence is a phenomenon whereby a material 
absorbs light at one frequency and then reradiates it at another frequency within 
about $10^{-8}$ seconds. Like phosphorescence, this re-emission has no directional 
character. 

To model fluorescence we change the scattering function to account for this 
transfer of energy from one wavelength to the next. Rather than simply integrating 
over all incident directions and weighting the energy at each one, we also integrate 
over all visible wavelengths and scale by a fluorescence efficiency $P_f(\lambda' \to \lambda)$ that 
models the transfer of energy from λ' to λ. In other words, we look in each direction 
and scatter not just the energy at λ, but the energy at all other λ' that will be absorbed 
and reradiated at λ. Symbolically, the scattering term becomes 



If a material has no fluorescence, then $P_f$ may be modeled with a discrete delta 
function $P_f(\lambda' \to \lambda) = \delta[\lambda - \lambda']$. 


17.2.4 FRE 

We can now put together the pieces above. We assume that there is no fluorescent-phosphorescent 
interaction; that is, energy absorbed at a wavelength λ for fluorescent 
reradiation at λ' does not contribute to later phosphorescent reradiation at λ'. 
There’s no mathematical reason not to model such an effect, but I have not seen it 
reported in the literature on luminescence [267]. 

To build the complete radiance equation we can replace the emission and scattering 
terms in Equation 17.1 with Equations 17.6 and 17.9, respectively. This gives 
us the following complete (but formidable) result for the radiance at wavelength λ 
arriving at a point r from a direction $\vec\omega$ at time t: 


$$L(\mathbf{r},\vec\omega,\lambda,\mathbf{e},t) = \mu(\mathbf{r},\mathbf{s})\,L^e(\mathbf{s},\vec\omega,t,\lambda) + \cdots$$







(17.10) 







where 


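$\mathbf{a} = \mathbf{r} - \alpha\vec\omega$ 

$f(\mathbf{s}, \lambda, \vec\omega' \to \vec\omega)$ = the surface BDF at s 
$f(\mathbf{a}, \lambda, \vec\omega' \to \vec\omega)$ = the volume BDF at a 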


Equation 17.10 is the Full Radiance Equation, which we refer to simply as the FRE.¹ 

The FRE is frankly a very difficult-looking expression, but it’s just a set of 
smaller pieces gathered into two groups, and a wrapper that combines the groups. 
The main idea behind the two groups is shown in Equation 17.3. All we have done is 
expand out those groups using the definitions above. 

The FRE can be tamed somewhat by putting it into operator notation. This 
doesn’t make it any easier to solve, but it is a bit easier to take in all the steps at once. 
The definitions of the operators come directly from their use in Equation 17.10. The 
operator form of the FRE may be written 

$$L = (\mathcal{M} + \mathcal{V})\,[L^e + \mathcal{P}\mathcal{A}L + \mathcal{K}\mathcal{F}L] \tag{17.11}$$

where 

$\mathcal{M}$ represents the attenuation of radiance from point s, 

$\mathcal{V}$ represents the attenuation of radiance from point a between r and s, 

$\mathcal{P}$ is the phosphorescence operator, 

$\mathcal{F}$ is the fluorescence operator, 

$\mathcal{A}$ is the absorption operator, and 

$\mathcal{K}$ is the BDF operator. 

There is no hope of solving the full radiance equation analytically for the function 
L, even if we had all of the other functions in a reasonable form. Much of practical 
image synthesis has been devoted to finding efficient and accurate approximations 
to solutions of this equation, by approximating either the solution, the equation, 
or both. That is, since an exact solution to the exact equation is intractable, we 
instead seek out exact solutions to an approximate equation, approximate solutions 
to the exact equation, or (most commonly) approximate solutions to an approximate 
equation. 


¹ The name radiance equation is similar to, but deliberately distinct from, the name rendering equation 
used by Kajiya [234]. Although Equation 17.10 could reasonably be called a “rendering equation,” the 
relationship given that name by Kajiya corresponds to our Equation 17.14, which is derived by a set of 
additional assumptions. I think that reassigning an existing name to a new equation is likely to cause 
confusion, so I have chosen a new name. 


17.3 TIGRE 


The FRE in Equation 17.10 is enough to challenge even the strongest of heart. Since 
we know that an exact solution to the exact equation is unlikely to be found for 
a nontrivial environment, we need to start cutting corners somewhere. The most 
common approximation is to eliminate the terms for polarization, phosphorescence, 
and fluorescence. 

Eliminating polarization means that we are assuming that all light in the image 
is unpolarized. If we avoid coherent light sources like lasers, we can start without 
polarized light, but we know from Fresnel’s laws that at Brewster’s angle (recall 
Equation 15.11) light will reflect off of a surface linearly polarized. If polarization is 
important for a particular image, we can run linearly independent simulations and 
then combine them. 

By eliminating fluorescence we are asserting that all wavelengths are decoupled. 
That is, the solution to the radiance at wavelength λ is independent of the solution at 
some other λ'. This means that we can compute a color image by solving a simplified 
radiance equation several times at several different wavelengths, and then combining 
the results. It also makes it much easier to compute color images using basis functions 
rather than spectral samples, since we don’t introduce arbitrary transformations on 
the bases. We can’t leave out the wavelength altogether, since the index of refraction 
depends upon it, so it remains in the expression for the scattering function. The 
result is called a monochromatic or gray equation. 

By eliminating phosphorescence we are asserting that every instant of time is the 
same as every other instant for the system. The underlying 3D model may change at 
each moment, but when we solve the rendering equation we assume that the model 
has been frozen in position in an enclosed environment without any illumination, 
and that only when the simulation begins are the lights turned on. Recalling our 
terminology from Unit II, this means that the modified radiance equation is 
time-invariant. 

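The resulting simplified equation is then 

$$L(\mathbf{r},\vec\omega) = \mu(\mathbf{r},\mathbf{s})\left[L^e(\mathbf{s},\vec\omega) + \int_{\Theta_i} f(\mathbf{s},\vec\omega'\to\vec\omega,\lambda)\,L(\mathbf{s},\vec\omega')\cos\theta'\,d\vec\omega'\right] + \int_0^{\|\mathbf{r}-\mathbf{s}\|} \mu(\mathbf{r},\mathbf{a})\left[L^e(\mathbf{a},\vec\omega) + \int_{\Theta_i} f(\mathbf{a},\vec\omega'\to\vec\omega,\lambda)\,L(\mathbf{a},\vec\omega')\cos\theta'\,d\vec\omega'\right] d\alpha \tag{17.12}$$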


Equation 17.12 is the time-invariant, gray radiance equation, or TIGRE. Notice 
that TIGRE contains volumetric terms, so it accommodates participating media such 
as smoke and fog, as well as volumetric objects defined not by surfaces but by fields 
in space. The geometry behind TIGRE is shown in Figure 17.1. 

Figure 17.1: The geometry of TIGRE. 

In operator notation, TIGRE may be written 

$$L = (\mathcal{M} + \mathcal{V})\,[L^e + \mathcal{K}L] \tag{17.13}$$


17.4 VTIGRE 

The formulation of TIGRE in Equation 17.12 is much simpler than the full radiance 
equation, but it is often simplified even further by assuming that all synthesis occurs in 
a vacuum. Under vacuum conditions, the entire volume integral on the right side of 
Equation 17.12 goes to zero: there is no volumetric emission, so $L^e(\mathbf{a}, \vec\omega) = 0$, and no 
scattering or absorption, so $f(\mathbf{a}, \vec\omega' \to \vec\omega, \lambda) = 0$, and $\mu(\mathbf{r}, \mathbf{s}) = \exp\left[\int 0\,dr\right] = e^0 = 1$. 
The result is then 

$$L(\mathbf{r},\vec\omega) = L^e(\mathbf{s},\vec\omega) + \int_{\Theta_i} f(\mathbf{s},\vec\omega'\to\vec\omega,\lambda)\,L(\mathbf{s},\vec\omega')\cos\theta'\,d\vec\omega' \tag{17.14}$$

Figure 17.2: The geometry of VTIGRE. 

Equation 17.14 is the vacuum, time-invariant, gray radiance equation, or VTIGRE. 
This equation expresses the same physics as the rendering equation introduced 
by Kajiya [234]. We chose not to use that name here because this is a special case of 
the full radiance equation (see the footnote in Section 17.2.4). The geometry behind VTIGRE 
is shown in Figure 17.2. In operator notation, VTIGRE may be written 

$$L = L^e + \mathcal{K}L \tag{17.15}$$
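The operator form suggests a simple way to see how a solution can be built up. In the sketch below the radiance is assumed to be known only at a handful of samples, so L becomes a vector and the BDF operator becomes a matrix; that discretization, the two-sample scene, and the function names are hypothetical. Repeatedly substituting the current estimate into the right-hand side sums emitted light, once-scattered light, twice-scattered light, and so on, and it converges when each scattering step loses energy.

```python
def apply_operator(K, L):
    """Matrix-vector product standing in for the operator K applied to L."""
    return [sum(K[i][j] * L[j] for j in range(len(L))) for i in range(len(K))]

def solve_by_iteration(K, L_e, iterations=60):
    """Iterate L <- L_e + K L, which sums L_e + K L_e + K K L_e + ..."""
    L = list(L_e)
    for _ in range(iterations):
        KL = apply_operator(K, L)
        L = [e + s for e, s in zip(L_e, KL)]
    return L

# Hypothetical two-sample scene: sample 0 emits, and each sample
# scatters part of the other's radiance.
K = [[0.0, 0.5],
     [0.4, 0.0]]
L_e = [1.0, 0.0]

L = solve_by_iteration(K, L_e)
```

For this tiny system the fixed point can be checked by hand: $L_0 = 1 + 0.5\,L_1$ and $L_1 = 0.4\,L_0$ give $L_0 = 1.25$ and $L_1 = 0.5$.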

When we are in a vacuum, we can refer to the radiance at r coming from s by 
simply indicating the direction vector $\vec\omega$ along which light from s would arrive. Then 
we can write VTIGRE in a slightly modified form that describes the light radiated 
from r in an outgoing direction $\vec\omega^o$ in terms of the incident radiance from all directions 
and the emission at r itself. 

This modified form is then 

$$L(\mathbf{r},\vec\omega^o) = L^e(\mathbf{r},\vec\omega^o) + \int_{\Theta_i} f(\mathbf{r},\vec\omega\to\vec\omega^o,\lambda)\,L(\mathbf{r},\vec\omega)\cos\theta^r\,d\vec\omega$$

where $\vec\omega^o \in \Theta_o$, and $\theta^r$ is the angle between the normal at r and $\vec\omega$. This form is 
sometimes a more useful starting place for rendering algorithms. We call this the 
OVTIGRE form, since it represents the outgoing form of the VTIGRE assumptions. 
The geometry behind OVTIGRE is shown in Figure 17.3. 

Figure 17.3: The geometry of OVTIGRE. 


17.5 Solving for L 

The goal of image synthesis is to find the function L that satisfies a full or simplified 
version of the radiance equation. This is called solving the radiance equation . 
Generally this involves positing an image surface that is within the scene. The result 
of the synthesis is an image that represents the light falling upon that image surface. 
Because most of our viewing devices are roughly planar, we typically imagine the 
viewing surface to be a viewing plane , though there is no restriction forcing that 
geometry [157,490]. 

We must decide whether this image surface is simply a hypothetical one through 
which light passes, or a real surface within the scene. If the image plane represents 
the film in a camera, then the camera model surrounding the film can influence the 
scene (e.g., by casting shadows or reflecting light). On the other hand, if the image 
plane is a theoretical construct, called a virtual image plane , then it is independent 
of the light distribution in the scene. The virtual image plane is the more common 
interpretation. This allows us to move the viewing plane throughout the scene 
without the need to compute a new radiance function after each move. If, on the 
other hand, the image plane is to be considered part of the model, it must be explicitly 
modeled and included in the 3D scene description. 

There are two general approaches to solving the radiance equation, which we call 
explicit approximation and implicit sampling. 

In explicit approximation we do our best to find an explicit construction for the 
radiance; that is, we try to find an explicit function $L(\mathbf{p}, \vec\omega)$ that satisfies the radiance 
equation. To generate an image, we need merely sample the function on the image 
plane and use the signal-processing techniques of Unit II to process those samples 
into image values. This approach is exemplified by the radiosity algorithms described 
in Chapter 18. 

The implicit sampling method is directed at finding L only at point samples in 
phase space (that is, pairs $(\mathbf{p}, \vec\omega)$) that are known to be necessary for making an 
image. Typically these are points on the image plane, in directions determined by 
the viewing geometry. This approach is exemplified by the ray-tracing algorithms 
described in Chapter 19. 

Consider a scene with a virtual image plane. Because the explicit approximation 
describes the radiance function everywhere in the scene, we can move the image 
plane to any location and evaluate the radiance function over its surface. The 
solution function L is thus said to be a view-independent solution , and the solution 
method is said to be a view-independent algorithm. 

On the other hand, the implicit sampling method is often used to gather radiance 
only over the surface of the viewing plane; therefore, the image is a view-dependent 
solution. If that plane moves, then perhaps some of the old values may be reused in 
some way, but many new samples must be taken. Therefore this method is said to 
be a view-dependent algorithm. 

The names view-dependent and view-independent imply that they are intimately 
connected with the motion of the image plane in a scene, but that is only one 
interpretation of the results. If the image plane is modeled as a real part of the 
scene, then an explicit approximation algorithm will still construct a function that 
can be evaluated anywhere. Because it includes the image plane as a piece of the 
scene, when the viewing plane moves, it changes the light distribution in the scene, 
and thus the radiance function must be recomputed. The solution in this case is not 
really view-independent. 

Therefore we prefer the more descriptive terms implicit sampling and explicit 
approximation , implying point sampling to find particular phase-space values of the 
radiance function, and evaluation of an explicit approximate function, respectively. 
We will look at these methods in more detail in the following chapters. 


17.6 Further Reading 

The original presentations of the radiance equation were given by Immel et al. 
[224] and Kajiya [234]. In particular, Kajiya gives an excellent history of various 
image synthesis algorithms presented in terms of the VTIGRE form [234]. Another 
discussion of the reduction of the flux equation to various forms of radiance equation 
is given by Arvo [15]. 


17.7 Exercises 

Exercise 17.1 

A phosphor has a maximum possible amount of energy that it can store at a given 
wavelength, and any incident energy beyond that is converted into heat. Assume 
that this cutoff is sharp. Extend Equation 17.6 to account for this saturation. Is the 
resulting function linear? What does this imply about finding analytic solutions to 
the full rendering equation using this function? 

Exercise 17.2 

Derive the form of the radiance equation appropriate for a medium that emits but 
does not scatter or absorb (this model is used by many volume rendering algorithms). 




RENDERING 


The surface of every opaque body is affected by 
the colour of the objects surrounding it. But this 
effect will be strong or weak in proportion as those 
objects are more or less remote and more or less 
strongly [coloured]. 


Leonardo da Vinci 





INTRODUCTION TO UNIT IV 


In Unit IV we concentrate on the process of taking a collection of 3D objects, light 
sources, and a viewpoint, and producing an image from that collection. This is 
the process of rendering. 

Our philosophy is that a rendering algorithm is a blend of ideas from vision, 
signal processing, and physics; each of these has had its own section in the book. 
We are now ready to draw them together into an interwoven program that produces 
accurate and good-looking synthetic images. 

There are many different types of rendering algorithms; there is no “best.” The 
field of rendering is both theoretical and practical. The theory starts with the radiance 
equation, which was our final achievement in Unit III. From there, we begin 
compromising and approximating: the full radiance equation is simply too hard to 
solve analytically for any nontrivial scene. The theory guides our algorithmic development, 
but we must pay heed to computer engineering and computer science as we go. 

This book is not intended as a practical guide to any particular rendering algorithm. 
Therefore I will not discuss issues that are important to a large-scale software 
project like a rendering system, including conceptual problems like architecture and 
implementation, and mechanical problems like reducing page faults. Such discussions 
may be found in the research papers describing the individual algorithms, which 
should certainly be consulted before you start programming. 

Rather than focus on complete descriptions, this unit emphasizes the big ideas 
and the major algorithms that are used in modern physically based rendering. (The 
term physically based means that the system is based on some model of physics, 
not necessarily the one that describes our world.) We will exclusively discuss global 
illumination algorithms, which take into account the distribution of light in the entire 
scene when deriving the color for any one surface point or image pixel. 

The field of global illumination is dominated by two major algorithms: radiosity 
and ray tracing. They are both based on solving the radiance equation, the former 
by constructing an explicit function that approximates the unknown radiance 
distribution, and the latter by evaluating point samples of that unknown function. 
These two algorithms make radically different assumptions about the environment. 
Classical radiosity assumes that everything in the scene is perfectly diffuse. Classical 
ray tracing assumes that objects are illuminated only by light coming directly from 
a light source, or via some intermediate bounces off of perfectly specular surfaces. 
We are actually fortunate that the two algorithms concentrate on such different phenomena, 
because they can be combined into a hybrid algorithm that uses the best 
features of both. 

The first chapters in this unit discuss the general ideas that make up radiosity, ray 
tracing, and the hybrid algorithms that combine these two. Our discussion will 
generally stay at a descriptive, rather than detailed, level; much more information 
on all the algorithms discussed may be found in the references listed in the Further 
Reading sections. We will then discuss some methods for accurately displaying the 
resulting images and for interacting with those images to influence their design. 

This unit demonstrates how the first three units tie together to produce complete 
rendering algorithms. We need to understand vision so we know how to sample and 
display the image, we need to understand signal processing so all of our sampling 
is free from artifacts, and we need to understand physics and materials so we can 
accurately simulate the interaction of light with the objects in the scene. We have 
covered all of these ideas in preceding chapters; here we will weave them together to 
make pictures. 



All parts of creation are linked together and 
interchange their influences. The balanced 
rhythm of the universe is rooted in reciprocity. 

Paramahansa Yogananda 
(“Autobiography of a Yogi,” 1946)



RADIOSITY


18.1 Introduction 

In this chapter we look at methods for creating explicit approximations to the radiance
function L. The most famous of these methods is radiosity, which was originally
developed for image synthesis by Goral et al. [165] and Nishita and Nakamae [321]. 
Like many other rendering methods, radiosity in its original form was extremely 
expensive in both memory and time; practical improvements to the method have 
greatly enhanced its popularity. 

The basic idea of explicit approximation methods is that from a given 3D scene 
description, we find some sort of explicit equation that expresses the distribution of 
radiance in the scene. Recall that the various radiance equations in Chapter 17 are 
all implicit expressions; they tell us the conditions that L must satisfy, but they don’t 
tell us how to find such an L. 

The development of practical methods for efficient radiosity is an active research 
area. In this chapter we present the basic ideas that underlie the methods, but we 
do not attempt to survey the practice in the field at the present time. Thoroughly 
detailed surveys with plenty of practical information are available in the references 
in the Further Reading section. 

18.2 Classical Radiosity

The method of classical radiosity was introduced by Goral et al. [165] and Nishita 
and Nakamae [321]. The basic approach assumes an environment that is in a 
vacuum, populated by purely diffuse, opaque surfaces. The first step is to subdivide 
the surfaces into smaller pieces; this is called meshing. Often we speak of a scene
as containing some number of surfaces, but a larger number of patches, where the
patches are created when the surfaces are meshed. We will use the terms surface and 
patch interchangeably. We assume that the reflected, incident, and emitted radiances 
are all constant in all directions at all points on each surface. Then at a given 
frequency, a single scalar represents the energy that is leaving each patch: this is a 
combination of the emission created by the surface (e.g., by blackbody radiation), 
and the radiosity B of the surface. The term radiosity is a synonym for the radiant 
exitance $M = \Phi/A$, which is defined in Equation 13.10 as the power $\Phi$ radiated per
unit area A of a surface. 

The radiosity algorithm establishes a set of linear equations that relates the fraction
of energy leaving one patch to the energy arriving at another, based on the
geometry between the two patches. If two patches are close and facing each other,
then a large amount of energy will be transferred between them. Two small patches
far away from each other exchange little energy. It’s not enough to simply transfer
power from the luminaires to the reflecting surfaces in the scene. Consider a
luminaire A and a few surfaces $M_i$, as shown in Figure 18.1.

Light is generated at A and travels into the scene. When it strikes $M_1$, some of
the light is reflected off that surface into all directions. Some of that reflected light
will strike $M_2$. In turn, $M_2$ will reflect some of that light, which will then fall back
upon $M_1$, and so it goes. Eventually the system settles down; we say it reaches steady
state or equilibrium. At that point the system is balanced; we find that all the energy
leaving the source is eventually absorbed by the surfaces.

In order to establish such an equilibrium condition we need to make sure we can 
account for all the light. Therefore the environment is usually surrounded by a large 
surface called the enclosure; that way no energy escapes from the system.

The result of the radiosity process is that we know how much light is being 
reradiated by each surface (remember that the surfaces are uniform and perfectly 
diffuse, so they give off the same amount of energy per unit area, or radiosity, in 
each direction from every point). To create an image, we need only determine which 
object is visible through each screen sample; once we know the object, we can look 
up its precomputed radiosity and immediately convert that to radiance for display. 
If we move the image surface, then all we need to do is recompute which objects are 
visible through which pixels; the distribution of light on each surface is unaffected 
by our movement of a virtual camera. 

We have glossed over many important details in this short description, which 
we will cover in more detail below. One of the most important problems involves 




FIGURE 18.1

Surface A is a luminaire, while the others are diffuse reflectors. (a) Light is radiated from the
luminaire. (b) Light is reflected from $M_1$ toward $M_2$ as well as the luminaire. (c) Light is reflected
from $M_2$ toward $M_1$ and the luminaire.
converting a solution to make an image. Typically in classical radiosity we will
compute the radiosity at the vertices of surface patches, and then blend the vertex
radiosities across the surface when displaying using hardware-assisted Gouraud
shading. In fact, we can navigate through a radiosity-processed environment very 
quickly with real-time hardware by associating the color of each surface element 
with its precomputed radiosity. One way to think about this process is that we have 
sampled the ideally continuous radiosity function in the scene at a number of points 
(the model vertices), and we are using a simple linear reconstruction filter (vertex 
interpolation) to reconstruct the signal. We know from Unit II that such a process is 
prone to all sorts of problems from aliasing to reconstruction errors, but often these 
are acceptable in an interactive environment. 
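
The vertex-interpolation step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, the triangle is taken to be already projected to 2D, and a single scalar radiosity is stored per vertex. It is the software analogue of what Gouraud-shading hardware does across a triangle.

```python
def barycentric_weights(p, a, b, c):
    """Barycentric weights of 2D point p with respect to triangle (a, b, c)."""
    (px, py), (ax, ay), (bx, by), (cx, cy) = p, a, b, c
    det = (by - cy) * (ax - cx) + (cx - bx) * (ay - cy)
    wa = ((by - cy) * (px - cx) + (cx - bx) * (py - cy)) / det
    wb = ((cy - ay) * (px - cx) + (ax - cx) * (py - cy)) / det
    return wa, wb, 1.0 - wa - wb


def interpolate_radiosity(p, triangle, vertex_radiosity):
    """Linearly reconstruct the radiosity at p from the three vertex samples."""
    wa, wb, wc = barycentric_weights(p, *triangle)
    ba, bb, bc = vertex_radiosity
    return wa * ba + wb * bb + wc * bc


# Hypothetical triangle and vertex radiosities, chosen only for illustration.
triangle = ((0.0, 0.0), (1.0, 0.0), (0.0, 1.0))
radiosities = (1.0, 2.0, 4.0)
center_value = interpolate_radiosity((1.0 / 3.0, 1.0 / 3.0), triangle, radiosities)
```

At a vertex the reconstruction returns that vertex's sample exactly; at the centroid it returns the mean of the three samples, which is the behavior of a linear reconstruction filter.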

The radiosity method has its origins in the finite elements and heat transfer 
communities [406]. We will present it in its traditional computer graphics form, 
following the development by Cohen and Wallace [99]. 

We begin with the OVTIGRE radiance equation (Equation 17.16), which expresses
the radiance $L(r, \omega)$ leaving a point r in the direction $\omega$ as a function
of the light emitted and propagated:

$$L(r, \omega^o) = L^e(r, \omega^o) + \int_{\Theta^i} f(r, \omega \to \omega^o, \lambda)\, L(r, \omega) \cos\theta_r \, d\omega \qquad (18.1)$$

where $\omega^o \in \Theta^o(r)$.

The first thing we want to do is to express the incident light in terms of the light 
leaving other surfaces; right now we need $L(r, \omega)$ arriving at r, but we don’t have
any way of evaluating it. We will use the same observation we used in deriving the
integral form of the transport equation, which is that the point s visible from r in
direction $\omega$ is given by the visibility (or nearest-surface or ray-tracing) function $\nu$,
such that $s = r + \nu(r, \omega)\,\omega$. Now we know that when the intercepted surface dS is
far away, the solid angle $d\omega$ in Equation 18.1 may be written

$$d\omega = \frac{dS \cos\theta_s}{\|r - s\|^2} \qquad (18.2)$$

where $\theta_s$ is the angle between the normal at dS and the line from s to r.

We can’t quite plug this into Equation 18.1, though, because we want to integrate 
over only the visible surfaces, not all points on all surfaces. There are two ways to 
limit the integration domain to visible surfaces. One is to explicitly construct a 
domain that is composed only of the visible surfaces. We can use any analytic 
visible-surface algorithm for this (such as that in Atherton et al. [20]), but such 
algorithms are often expensive for very complex scenes. An alternative is to build a 
masking function into the integral that limits the domain implicitly. In our case we 
want a visibility-test function V (r, s) that has the value 1 when r and s are mutually 
visible, and is otherwise 0. We can build such a function by finding the nearest 
visible point from r in the direction of s, and checking to see if that is indeed the test
point s:

$$V(r, s) = \begin{cases} 1 & \text{if } s = r + \nu(r, (s - r))\,(s - r) \\ 0 & \text{otherwise} \end{cases} \qquad (18.3)$$


We can now switch domains to all points on all surfaces M, knowing that only 
the nearest-visible points will actually contribute to the integral: 


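$$\begin{aligned}
L(r, \omega^o) &= L^e(r, \omega^o) + \int_{\Theta^i} f(r, \omega \to \omega^o, \lambda)\, L(s, \omega) \cos\theta_r \, d\omega \\
&= L^e(r, \omega^o) + \int_M f(r, \omega \to \omega^o, \lambda)\, L(s, \omega) \cos\theta_r \frac{\cos\theta_s}{\|r - s\|^2} V(r, s)\, ds \\
&= L^e(r, \omega^o) + \int_M f(r, \omega \to \omega^o, \lambda)\, L(s, \omega)\, G(r, s)\, ds
\end{aligned} \qquad (18.4)$$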

where we have introduced the geometry term G to represent the purely geometric 
part of the computation. Notice that G is completely independent of the energy 
flowing in the scene. 
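
As a concrete illustration of the geometry term, the sketch below evaluates $G(r, s) = \cos\theta_r \cos\theta_s \, V(r, s) / \|r - s\|^2$ for two surface points with unit normals. The function name and argument layout are hypothetical, the visibility result is passed in as a flag rather than computed by a ray cast, and treating back-facing pairs as contributing zero is an assumption of this sketch.

```python
import math


def geometry_term(r, n_r, s, n_s, visible=True):
    """Evaluate G(r, s) for points r and s with unit normals n_r and n_s.

    `visible` stands in for the visibility-test function V(r, s).
    """
    d = [si - ri for si, ri in zip(s, r)]
    dist2 = sum(x * x for x in d)
    if not visible or dist2 == 0.0:
        return 0.0
    dist = math.sqrt(dist2)
    w = [x / dist for x in d]                       # unit direction from r to s
    cos_r = sum(a * b for a, b in zip(n_r, w))      # angle at r
    cos_s = -sum(a * b for a, b in zip(n_s, w))     # angle at s, measured toward r
    if cos_r <= 0.0 or cos_s <= 0.0:
        return 0.0                                  # surfaces face away from each other
    return cos_r * cos_s / dist2


# Two points facing each other along the z axis, two units apart.
g = geometry_term((0, 0, 0), (0, 0, 1), (0, 0, 2), (0, 0, -1))
```

Nothing in the function refers to radiance, which mirrors the observation that G is independent of the energy flowing in the scene.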

Equation 18.4 is not an efficient means for computing $L(r, \omega^o)$, since it involves
lots of calculations that get weighted by 0 and therefore contribute nothing. We
would prefer to integrate over $M'$, a subset of M that contains only those surfaces
that are likely to be visible from r.

The question now is how we might solve this integral equation. It expresses the 
radiance at a point and direction potentially in terms of all other radiances at all other 
points and locations in the scene. What we want is to find an explicit formula for 
L that satisfies these conditions. Classical radiosity is based on two of the methods 
that we covered in Chapter 16: polynomial collocation and Galerkin solutions. We 
will cover their application to Equation 18.4 in turn. 


18.2.1 Collocation Solution 

Recall from Section 16.8.3 that the collocation method is based on finding a set of 
collocation points for which we can find mutually consistent values of the unknown 
function. In our case, the collocation points are phase-space points $c_a = (r_a, \omega_a)$
where a is an integer index. Then for all such points we write the value of L, and 
then solve the set of simultaneous equations. 

To get the ball rolling, we write the value of L at the collocation points $c_a$. We
write Equation 18.4 but we pull all the terms involving L to the left-hand side:

$$L(c_a) - \int_M f(r, \omega \to \omega^o, \lambda)\, L(s, \omega)\, G(r_a, s)\, ds = L^e(c_a) \qquad (18.5)$$

Notice that one part of the geometry term varies with the integration points across
the scene surfaces, but the other part is locked down to the collocation point $r_a$.

The next step is to expand the radiance with respect to a set of g basis functions
$\{\psi_b(r, \omega)\}_{b=1}^{g}$. These functions are linearly independent over the phase space $\mathcal{R}^3 \otimes S^2$,
but they are not necessarily orthogonal. An approximate solution $\hat{L}$ can be
written as a weighted sum of these bases:

$$\hat{L}(r, \omega) = \sum_{b=1}^{g} L_b \psi_b(r, \omega) \qquad (18.6)$$

The goal now is to find $L_b$. The most straightforward solution is to simply project
the exact solution L onto the duals of the bases, but if we knew L we wouldn’t be
bothering with collocation at all! So instead we find values for $L_b$ that hold at the
collocation points. 

To this end we substitute the approximate L in Equation 18.6 into Equation 18.5, 
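resulting in L at the $c_a$:

$$\sum_{b=1}^{g} L_b \psi_b(c_a) - \int_M f(r, \omega \to \omega^o, \lambda) \sum_{b=1}^{g} L_b \psi_b(s, \omega)\, G(c_a, s)\, ds = L^e(c_a) \qquad (18.7)$$

$$\sum_{b=1}^{g} L_b \left[ \psi_b(c_a) - \int_M f(r, \omega \to \omega^o, \lambda)\, \psi_b(s, \omega)\, G(c_a, s)\, ds \right] = L^e(c_a) \qquad (18.8)$$

We can write this more succinctly as a matrix relationship

$$K L = L^e \qquad (18.9)$$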


If there are g collocation points, then K is a square matrix. If K is nonsingular, 
then we can find our solution by inversion: $L = K^{-1} L^e$. The expensive part of this
method is evaluating the elements of the matrix K, and then inverting it. 


18.2.2 Galerkin Solution

Recall from Section 16.8.3 that the Galerkin approach to solving an integral equation 
starts with the observation that we’re going to approximate the solution by a sum 
of weighted bases. The goal of the method is to find a solution that is orthogonal to 
each basis function. That is, in terms of Equation 18.6, 

$$\langle L \mid \psi_b \rangle = 0 \qquad (18.10)$$

for all b, where the braket in this case expands to

$$\langle f \mid g \rangle = \int_M \int_{S^2} f(r, \omega)\, g(r, \omega)\, d\omega\, dr \qquad (18.11)$$

We will not explicitly conjugate the first argument of the brakets in this unit since all
our functions are real.

Writing out Equation 18.4 with the L terms on the left and expanded, we find 

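$$\left\langle L - \int_M f(r, \omega \to \omega^o, \lambda)\, G(r, s)\, L \, ds \;\middle|\; \psi_a \right\rangle = \langle L^e \mid \psi_a \rangle \qquad (18.12)$$

for all values of a. Substituting Equation 18.6 for L,

$$\left\langle \sum_{b=1}^{g} L_b \psi_b - \int_M f(r, \omega \to \omega^o, \lambda)\, G(r, s) \sum_{b=1}^{g} L_b \psi_b \, ds \;\middle|\; \psi_a \right\rangle = \langle L^e \mid \psi_a \rangle \qquad (18.13)$$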


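Grouping the common terms on $L_b$ and moving the summation outside the brakets,
we find

$$\sum_{b=1}^{g} L_b \left[ \langle \psi_b \mid \psi_a \rangle - \left\langle \int_M f(r, \omega \to \omega^o, \lambda)\, G(r, s)\, \psi_b \, ds \;\middle|\; \psi_a \right\rangle \right] = \langle L^e \mid \psi_a \rangle \qquad (18.14)$$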


Once again this may be expressed as a matrix equation $K L = L^e$. If we can
invert K, then we can find $L = K^{-1} L^e$. As we would expect from the discussion
in Chapter 16, the increased accuracy of the Galerkin method over the collocation 
method comes at the increased cost of computing the Galerkin matrix elements. 


18.2.3 Classical Radiosity Solution

The classical radiosity method does not try to solve Equation 18.14. Instead, we 
make a number of simplifying assumptions that greatly reduce the difficulty of
the problem. Some of these assumptions in turn imply other conditions about the 
radiance function, the surfaces, or both. 

In the classical method we assume the following: 

1 All surfaces are opaque. 

2 All surfaces are perfect diffuse reflectors. 

3 Surfaces are small. 

4 The radiosity across a surface is constant. 

5 The irradiance across a surface is constant. 

Because the surfaces are diffuse, we can drop the dependence on $\omega$ in the basis
functions. 

Because we’re dealing with purely diffuse surfaces, we can use the relationship
$B = L\pi$ to divide through by $\pi$ and transform the radiances L to the radiosities
B, and the BRDF f to $\rho$. That means the radiance coefficients $L_i$ become radiosity
coefficients $B_i$.

The classical method uses box functions $b_i$ for the basis set. In this formulation,
there is one basis function per surface patch in the environment. The function $b_i(r)$
is defined as

$$b_i(r) = \begin{cases} 1 & r \in M_i \\ 0 & \text{otherwise} \end{cases} \qquad (18.15)$$


so that it has the value 1 everywhere on patch i, and is 0 everywhere else. This means 
that our number of basis functions g is the same as the number of patches n. 
Because the basis functions are disjoint, 


$$\langle b_i \mid b_k \rangle = \delta_{ik} A_i \qquad (18.16)$$

where $\delta_{ik}$ is shorthand for $\delta(i - k)$; that is, it is 1 when i = k and 0 otherwise. Patch
i has area $A_i$, which is the result of integrating the unit function over the patch.

The projection of the emitted terms is similarly 

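$$\langle E \mid b_i \rangle = \int_{M_i} E(p)\, dp = E_i A_i \qquad (18.17)$$

where $E_i$ is the emitted power per unit area on patch i.
Finally, we can compute the remaining brakets from

$$\left\langle \int_M f(r, \omega \to \omega^o, \lambda)\, G(r, s)\, b_i \, ds \;\middle|\; b_k \right\rangle = \frac{\rho_i}{\pi} \int_{M_i} \int_{M_k} G(i, k)\, dk\, di \qquad (18.18)$$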


Putting all the pieces of Equation 18.14 back together again with these simplified 
values, we find 


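$$\sum_{k=1}^{n} B_k \left( \delta_{ik} A_i - \rho_i \int_{M_i} \int_{M_k} \frac{\cos\theta_i \cos\theta_k}{\pi \|i - k\|^2} V(i, k)\, dk\, di \right) = E_i A_i \qquad (18.19)$$

or by noticing that the first term is simply $B_i A_i$,

$$B_i A_i = E_i A_i + \rho_i \sum_{k=1}^{n} B_k \int_{M_i} \int_{M_k} \frac{\cos\theta_i \cos\theta_k}{\pi \|i - k\|^2} V(i, k)\, dk\, di \qquad (18.20)$$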


The rightmost double-integral term in this equation is pure geometry: it depends 
only on the relationship between the two patches and not on the energy flowing 
between them. 

This value is called the form factor F and is useful because it represents the 
percentage of the energy radiated into the scene by i that reaches k. The form factor
from patch i to patch k, written $F_{i,k}$, is defined as

$$F_{i,k} = \frac{1}{A_i} \int_{M_i} \int_{M_k} \frac{\cos\theta_i \cos\theta_k}{\pi \|i - k\|^2} V(i, k)\, dk\, di \qquad (18.21)$$

FIGURE 15.10

Vases rendered with the Cook-Torrance model. Reprinted, by
permission, from Cook and Torrance in ACM Transactions
on Graphics, fig. 7, p. 22.



FIGURE 15.17

Three chairs made of varnished wood. The chair in (a) is a photograph. The chair in (b) 
was rendered with Ward’s isotropic reflectance model, that in (c) with Ward’s anisotropic 
reflectance model. Reprinted, by permission, from Ward in Computer Graphics (Proc. 
Siggraph ’92), fig. 8a-c, p. 271. 







FIGURE 15.19

A velvet cushion made with the shading 
model of Westin et al. (©1992 Stephen H. 
Westin/Cornell University Program of 
Computer Graphics .) 


FIGURE 15.20

A nylon cushion made with the shading 
model of Westin et al. (©1992 Stephen H. 
Westin/Cornell University Program of 
Computer Graphics .) 




FIGURE 15.28

(a) A model for the atmosphere, (b) Earth, (c) Earth and atmosphere, (d) Earth, atmosphere, 
and clouds. Reprinted, by permission, from Nishita et al. in Computer Graphics (Proc . 
Siggraph ’93), figs. 6 and 7a-c, p. 181. (Courtesy ofTomoyuki Nishita , Fukuyama University , 
and Eihachiro Nakamae, Hiroshima Prefectural University.) 





FIGURE 15.31

Results of the Kubelka-Munk theory, (a) A photograph of a real canvas painted with 
mixtures of cadmium red (top) and naphthol red (bottom). The tint concentrations, from left 
to right, were 2, 5, 10, 20, 40, 80, and 100% of dry weight by pigment, (b) A simulation of 
the canvas in (a) using RGB values to mix the reds and white, (c) A simulation of the canvas 
in (a) using the Kubelka-Munk theory. Reprinted, by permission, from Haase and Meyer in 
ACM Transactions on Graphics , figs. 7, 8, and 11, pp. 316-319. (Courtesy of Chet Haase 
and Gary Meyer, Department of Computer and Information Science, University of Oregon.) 




FIGURE 15.35

A head rendered by Lambert shading (left) and subsurface shading (middle). 
The right column is red where the subsurface model reflected more light, and 
blue where it reflected less. Reprinted, by permission, from Hanrahan and 
Krueger in Computer Graphics (Proc. Siggraph *93), plate 2, p. 172. 



FIGURE 15.36

A head made with the subsurface reflectance model. Reprinted, 
by permission, from Hanrahan and Krueger in Computer 
Graphics (Proc, Siggraph *93), plate 4, p. 172. 






FIGURE 15.37

Three examples of hypertexture, (a) Eroded cube, (b) Fire 
ball, (c) Tribble. Reprinted, by permission, from Perlin and 
Hoffert in Computer Graphics (Proc. Siggraph ’89), p. 258. 






FIGURE 15.38

A furry bear using Kajiya and
Kay’s texturing method. (“Herbert
the Bear” by J. Kajiya, T. Kay,
J. Snyder. Produced at Caltech and
IBM. ©1989 Caltech, IBM.)



FIGURE 15.39

Some buttons exhibiting 
displacement and color 
textures. Reprinted, by 
permission, from Witkin and 
Kass in Computer Graphics 
(Proc. Siggraph *91), fig. 3, 
p. 307. 



FIGURE 15.40

Some mushrooms exhibiting 
displacement textures. 
Reprinted, by permission, 
from Witkin and Kass in 
Computer Graphics (Proc. 
Siggraph *91), fig. 5, p. 308. 













FIGURE 15.44

A hierarchy of scales. (©1992
Stephen H. Westin/Cornell
University Program of Computer
Graphics.)



FIGURE 15.45

A bumpy teapot, (a) The yellow parts of the teapot indicate where the BRDF was used, blue 
indicates redistribution bump mapping, and red is displacement mapping, (b) The same image 
without color coding. (Courtesy of Nelson Max, the University of California, Lawrence 
Livermore National Laboratory, and the Department of Energy.) 





FIGURE 15.47

The left column shows (from top to
bottom) the result of using two, three,
four, and five basis functions to model a
scene; on the right are four, nine, sixteen,
and twenty-five equally spaced point
samples for the same scene. Reprinted,
by permission, from Peercy in Computer
Graphics (Proc. Siggraph ’93), fig. 6,
p. 197.



FIGURE 18.4

An indoor scene with 607 surfaces
solved by Galerkin methods. Reprinted,
by permission, from Zatz in Computer
Graphics (Proc. Siggraph ’93), fig. 13,
p. 220.













Gauss-Seidel iteration after (a) one, (b) two, (c) twenty-four, and 
(d) 100 steps. Reprinted, by permission, from Cohen et al. in 
Computer Graphics (Proc . Siggraph '88), fig. 2, p. 80. 



FIGURE 18.12

Southwell iteration after (a) one, (b) two, (c) twenty-four, and 
(d) 100 steps. Reprinted, by permission, from Cohen et al. in 
Computer Graphics (Proc. Siggraph '88), fig. 4, p. 81. 






FIGURE 18.13

Progressive refinement after (a) one, (b) two, (c) twenty-four, and 
(d) 100 steps. Reprinted, by permission, from Cohen et al. in 
Computer Graphics (Proc. Siggraph ’88), fig. 5, p. 81. 



FIGURE 18.14

A scene of 2,000 patches initially computed 
by progressive refinement. Reprinted, by 
permission, from Cohen et al. in Computer 
Graphics (Proc. Siggraph '88), fig. 9, p. 83. 


FIGURE 18.30

A quadratic spline teapot. Reprinted, by 
permission, from Wallace et al. in Computer 
Graphics (Proc. Siggraph '89), fig. 11, p. 323. 





FIGURE 18.31

The nave of Chartres Cathedral rendered with ray-traced form 
factors. Reprinted, by permission, from Wallace et al. in Computer 
Graphics (Proc. Siggraph *89), fig. 14, p. 323. 



FIGURE 18.54

Results of hierarchical subdivision. The link colors indicate the degree of visibility between 
the two patches: white is completely visible, green and pink are partly visible, and dark blue 
is almost invisible. The three images show increasingly larger patches. Reprinted, by 
permission, from Hanrahan et al. in Computer Graphics (Proc. Siggraph ’91), fig. 7, p. 202. 



























FIGURE 18.59

Results of BF refinement. The images on the left are the solutions corresponding to the 
refinements on the right. Reprinted, by permission, from Hanrahan et al. in Computer 
Graphics (Proc . Siggraph *91), fig. 8, p. 205. 




















FIGURE 18.60

(a) The radiosity solution, (b) The importance solution, (c) The sum of radiosity and 
importance. Reprinted, by permission, from Smits et al. in Computer Graphics (Proc . 
Siggraph ’92), figs. 1-3, pp. 274-275. 













An importance-driven radiosity solution. These are different images of the same solution, (a) 
A close-up of the patches generated, (b) A smoothly reconstructed version of (a), (c) The 
solution for (a) but seen from farther back, (d) The importance solution for (a) from farther 
back, (e) The solution for the whole environment for (a), (f) The importance of the whole 
environment for (a). Reprinted, by permission, from Smits et al. in Computer Graphics (Proc. 
Siggraph *92), figs. 6-11, p. 280. 



































FIGURE 18.69

Discontinuity meshing and regular subdivision. Reprinted, by permission, from 
Lischinski et al. in Computer Graphics (Proc. Siggraph ’93), fig. 4, p. 203. 



FIGURE 18.70

(a) A wall produced with discontinuity meshing, (b) A wall produced with standard meshing. 
Reprinted, by permission, from Lischinski et al. in IEEE Computer Graphics & Applications, 
fig. 14, p. 37. (©1992 IEEE.) 




































FIGURE 18.71

(a) Discontinuity meshing, (b) Hierarchical radiosity. Reprinted, by permission, from 
Lischinski et al. in Computer Graphics (Proc. Siggraph '93), fig. 9, p. 207. 



FIGURE 18.73

Light scattering in a smoky room at sunset. Reprinted, by permission, from Rushmeier 
and Torrance in Computer Graphics (Proc. Siggraph '87), fig. 12c, p. 302. 
















FIGURE 19.20

An image produced by classical ray tracing. Reprinted, by permission, 
from Whitted in Communications of the ACM, fig. 7, p. 347. 



An image produced by distribution ray tracing. (“1984” by Tom Porter ; 
based on research by Rob Cook . ©1984, Lucasfilm , Ltd.) 






FIGURE 19.44

(a) A scene to be rendered by photon tracing, (b) The distribution of photon hits on the wall. 
(Courtesy of Sumanta Pattanaik.) 



FIGURE 19.47

An image created by a three-pass 
algorithm. Note the caustic formed 
by the lens. (Courtesy of Paul 
Heckbert.) 








FIGURE 19.48

Hybrid rendering, (a) Direct illumination 
only, (b) The radiosity solution, (c) The full 
solution. Reprinted, by permission, from 
Wallace et al. in Computer Graphics (Proc. 
Siggraph *87), fig. 10(a)-(c), p. 319. 



FIGURE 19.49

An image rendered using a 
three-pass hybrid method. 

(Courtesy of Peter Shirley.) 












(a) Direct illumination and caustics from radiosity. (b) Interreflections from 
radiosity. (c) Direct illumination from visibility ray tracing, (d) Caustics 
from photon ray tracing, (e) Interreflections from visibility ray tracing. 

(f) The final image. Reprinted, by permission, from Chen et al. in Computer 
Graphics (Proc. Siggraph ’91), figs. 3c and 5a-e, pp. 171-173. 





FIGURE 19.83

A volume model rendered with ray tracing. 
Reprinted, by permission, from Levoy in 
ACM Transactions on Graphics , fig. 8, p. 257. 


FIGURE 19.34

A volume model including atmospheric 
media. (Courtesy of Masa Inakage.) 



FIGURE 20.3

A room lit by a light source of 1,000, 10, .1, .001, and .00001 lamberts (reading from 
left to right, top to bottom). The lower left is an unprocessed reference image. 
(Courtesy of Jack Tumblin.) 










FIGURE 20.4

Color images processed for adaptation, (a) A cabin by day. (b) A cabin by night. 
(Courtesy of Greg Ward.) 



FIGURE 20.5

Images corrected by the Ward scale factor. (a) A cabin by day. (b) A cabin by night.
(Courtesy of Greg Ward.) 













FIGURE 20.8

(a) The blurred image, (b) The scaling function, (c) The clamped scaling function, 
(d) The smoothed clamped scaling function. (Courtesy of Kenneth Chiu.) 




FIGURE 20.7

An image filtered for scaling and blooming. (Courtesy of Kenneth Chiu.) 
















FIGURE 20.9

Images corrected by the Ward scale factor, with the adaptation level set at the window, 
(a) A cabin by day. (b) A cabin by night. (Courtesy of Greg Ward.) 









FIGURE 20.11

A conference room with optimized lighting. (a) An
impression of visual clarity. (b) An impression of privacy.
Reprinted, by permission, from Kawai et al. in Computer
Graphics (Proc. Siggraph ’93), fig. 6, p. 154.

The factor $1/A_i$ is convenient because it means the form factor satisfies a symmetrical
reciprocity rule:

$$F_{k,i} A_k = F_{i,k} A_i \qquad (18.22)$$

which comes straight from the definition. This rule is very useful in computations.
The form factor is also called the radiation factor, the angle factor, and the configuration
factor.
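
To make the form-factor definition and the reciprocity rule concrete, here is a minimal numerical sketch. It is not the method used in the text: it assumes two parallel, fully visible rectangles (so V = 1) whose corners sit above one another, and it approximates the double area integral with a midpoint rule. The function name and the grid resolution n are hypothetical choices.

```python
import math


def form_factor_parallel_rects(wi, hi, wk, hk, dist, n=8):
    """Approximate F_{i,k} between two parallel, unoccluded rectangles.

    Patch i spans [0, wi] x [0, hi] at z = 0 facing +z; patch k spans
    [0, wk] x [0, hk] at z = dist facing -z. Each patch is cut into n x n cells.
    """
    d_ai = (wi / n) * (hi / n)
    d_ak = (wk / n) * (hk / n)
    total = 0.0
    for a in range(n):
        xi = (a + 0.5) * wi / n
        for b in range(n):
            yi = (b + 0.5) * hi / n
            for c in range(n):
                xk = (c + 0.5) * wk / n
                for d in range(n):
                    yk = (d + 0.5) * hk / n
                    r2 = (xk - xi) ** 2 + (yk - yi) ** 2 + dist ** 2
                    cos_i = cos_k = dist / math.sqrt(r2)   # parallel patches
                    total += cos_i * cos_k / (math.pi * r2) * d_ai * d_ak
    return total / (wi * hi)                               # the 1/A_i factor


# Reciprocity check on a 1 x 1 patch facing a 2 x 1 patch.
f_ik = form_factor_parallel_rects(1.0, 1.0, 2.0, 1.0, 1.0)
f_ki = form_factor_parallel_rects(2.0, 1.0, 1.0, 1.0, 1.0)
lhs, rhs = 1.0 * f_ik, 2.0 * f_ki                          # A_i F_{i,k} and A_k F_{k,i}
```

Because the double integral is symmetric in i and k, the two products agree up to rounding, which is exactly what Equation 18.22 states.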

Writing Equation 18.20 in terms of the form factor gives 

$$B_i A_i = E_i A_i + \rho_i \sum_{k=1}^{n} B_k A_i F_{i,k} \qquad (18.23)$$

Using the reciprocity of form factors in Equation 18.22, we note that we can re-express
this equation as

$$B_i A_i = E_i A_i + \rho_i \sum_{k=1}^{n} B_k A_k F_{k,i} \qquad (18.24)$$

Equation 18.24 is the key to developing an intuition for the classical radiosity 
method. On the left appears $B_i A_i$, which is the product of the power per unit
area leaving patch i times the area of patch i; thus it’s the total power radiated by
patch i into the universe. This total power is the sum of two terms: the power emitted 
directly by the patch itself, and the power propagated by the patch by reflection. 

Writing out the power explicitly, 

$$\Phi_i^o = \Phi_i^e + \rho_i \sum_{k=1}^{n} \Phi_k F_{k,i} \qquad (18.25)$$

we can see at a glance that the form factor describes how much of the energy
radiated by patch $M_k$ gets to $M_i$.

The first term on the right of Equation 18.24 is the emitted power per unit area 
on patch i (e.g., due to incandescent or blackbody processes), times the area of patch 
i, so this is the total power generated by the patch and sent into the environment.

The right-hand term tells us to look around at every patch k in the environment. 
We find the power emitted by that patch from $B_k A_k$. The form factor $F_{k,i}$ tells us
how much of this power reaches patch i. The power received from each patch k in
the scene is accumulated, and then scaled by the reflectivity $\rho_i$ of the patch. This
reflected power is added to the emitted power to represent the total outgoing power. 

The value of Equation 18.24 is that it combines both the self-emitted power and 
the reflected power from every patch to express the total outgoing power. This is 
the power used by the other n − 1 patches. In other words, Equation 18.24 specifies
$B_i A_i$, and there are n − 1 other equations giving the power of all the other patches.
By solving all these equations simultaneously, we get a set of consistent values for Bi 
that represent the radiosity of every patch in a stable environment. 

Equation 18.24 can be made a bit more efficient for practical calculation. As 
before, we first use the reciprocity of form factors to write $A_i F_{i,k} = A_k F_{k,i}$,

$$B_i A_i = E_i A_i + \rho_i \sum_{k=1}^{n} B_k A_i F_{i,k} \qquad (18.26)$$

and then we divide through by $A_i$:

$$B_i = E_i + \rho_i \sum_{k=1}^{n} B_k F_{i,k} \qquad (18.27)$$

Equation 18.27 is the classical radiosity equation.

Equation 18.27 expresses a set of n simultaneous equations for the radiosity B for 
each of the n patches in the scene. We get the classical radiosity system of equations 


(I - F)B = E 


(18.28) 


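or in tableau form,

$$\begin{bmatrix}
1 - \rho_1 F_{1,1} & -\rho_1 F_{1,2} & \cdots & -\rho_1 F_{1,n} \\
-\rho_2 F_{2,1} & 1 - \rho_2 F_{2,2} & \cdots & -\rho_2 F_{2,n} \\
\vdots & \vdots & \ddots & \vdots \\
-\rho_n F_{n,1} & -\rho_n F_{n,2} & \cdots & 1 - \rho_n F_{n,n}
\end{bmatrix}
\begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_n \end{bmatrix}
=
\begin{bmatrix} E_1 \\ E_2 \\ \vdots \\ E_n \end{bmatrix}$$

Once the matrix has been built, it may be inverted to find the solution
$B = (I - F)^{-1} E$.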

The matrix elements in Equation 18.28 are much easier to construct than those 
in the full collocation or Galerkin cases; constructing them depends only on computing
the form factors.
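
The text solves the system by inverting the matrix. As an alternative illustration, the sketch below repeatedly applies Equation 18.27 to the current estimate (a Jacobi-style iteration), which mimics light bouncing until the values settle. This swaps in a different solver from the one named in the text; the function name, iteration count, and the two-patch data are hypothetical. The iteration is expected to converge when every reflectivity is below 1 and each row of form factors sums to at most 1.

```python
def solve_radiosity(E, rho, F, iterations=100):
    """Iterate B_i = E_i + rho_i * sum_k F[i][k] * B_k, starting from B = E."""
    n = len(E)
    B = list(E)
    for _ in range(iterations):
        B = [E[i] + rho[i] * sum(F[i][k] * B[k] for k in range(n))
             for i in range(n)]
    return B


# Hypothetical two-patch scene: patch 0 emits, both patches reflect half the light.
E = [1.0, 0.0]
rho = [0.5, 0.5]
F = [[0.0, 0.5],
     [0.5, 0.0]]
B = solve_radiosity(E, rho, F)
```

Each pass adds one more bounce of light to the estimate, so stopping early gives an image that is too dark but already shows the dominant transfers.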

We can demonstrate the classical radiosity technique on a simple example. To 
make life easy, we will choose a geometric situation where analytic form factors are 
known and not too complicated. Figure 18.2 shows an infinite shelf made up of 
three flat pieces: a flat bottom and two walls. The width of the floor B of the shelf 
is b, and the walls A and C each have height a above it. Defining $g = b/a$, from
Appendix D, we can write the six form factors relating each pair of surfaces. Because 









FIGURE 18.2

An infinite shelf. 


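of symmetry, we need only the three form factors

$$\begin{aligned}
F_{a,b} &= \tfrac{1}{2}\left(1 + g - \sqrt{1 + g^2}\right) \\
F_{a,c} &= \sqrt{1 + g^2} - g \\
F_{b,c} &= \tfrac{1}{2}\left(1 + (1/g) - \sqrt{1 + (1/g)^2}\right)
\end{aligned} \qquad (18.30)$$

The others are

$$F_{c,b} = F_{a,b} \qquad F_{c,a} = F_{a,c} \qquad F_{b,a} = F_{b,c} \qquad (18.31)$$

The form factor matrix K is

$$K = \begin{bmatrix}
1 & -\rho_a F_{a,b} & -\rho_a F_{a,c} \\
-\rho_b F_{b,a} & 1 & -\rho_b F_{b,c} \\
-\rho_c F_{c,a} & -\rho_c F_{c,b} & 1
\end{bmatrix} \qquad (18.32)$$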


Figure 18.3 shows four example configurations of the shelf, with results tabulated 
in Table 18.1. In the first example we turn on emission from wall A only, set its 
reflectivity to 0, and let the light bounce around. Then we assign a bit of reflectivity 
to wall A; note that all the radiances go up. This is because of light that leaves A, 
bounces off of B or C, and then reflects off A to strike B or C again. A bit more



FIGURE 18.3

(a) Wall A radiating, B and C reflecting. (b) Wall A radiating, all walls reflecting. (c) Wall B
radiating, B and C reflecting. (d) Walls A and B radiating, all walls reflecting.

TABLE 18.1

Results for the infinite shelf environment. 


light therefore comes off of A, and also contributes to B and C. In the third case
the only emitter is the floor B; because of symmetry we expect the radiosity on the
walls to be equal, and the data shows that it is. Finally, we turn on both the floor 
and the wall, with a bit of reflectivity for both, and we see that the illumination on 
the right-hand wall goes up. 
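
The shelf example is small enough to reproduce directly. The sketch below builds the form factors of Equations 18.30 and 18.31, assembles the matrix K of Equation 18.32, and solves the 3 × 3 system by Gaussian elimination. The function names are hypothetical, and the dimensions, reflectivities, and emissions used here are illustrative values only; they are not the entries of Table 18.1.

```python
import math


def shelf_form_factors(a, b):
    """Form-factor table for walls A, C (height a) and floor B (width b)."""
    g = b / a
    f_ab = 0.5 * (1.0 + g - math.sqrt(1.0 + g * g))
    f_ac = math.sqrt(1.0 + g * g) - g
    f_bc = 0.5 * (1.0 + 1.0 / g - math.sqrt(1.0 + (1.0 / g) ** 2))
    # Rows and columns are ordered A, B, C; the rest follow by symmetry.
    return [[0.0, f_ab, f_ac],
            [f_bc, 0.0, f_bc],
            [f_ac, f_ab, 0.0]]


def solve_linear(K, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    M = [row[:] + [rhs[i]] for i, row in enumerate(K)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x


def shelf_radiosity(a, b, rho, E):
    """Solve K B = E for the radiosities of A, B, and C."""
    F = shelf_form_factors(a, b)
    K = [[(1.0 if i == k else 0.0) - rho[i] * F[i][k] for k in range(3)]
         for i in range(3)]
    return solve_linear(K, E)


# Illustrative case: only the floor B emits, all surfaces reflect half the light.
B_floor_lit = shelf_radiosity(1.0, 1.0, [0.5, 0.5, 0.5], [0.0, 1.0, 0.0])
```

With only the floor emitting, the two wall radiosities come out equal, as the symmetry argument predicts; giving wall A some reflectivity in the wall-lit case raises every radiosity, as described for the second configuration.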


18.2.4 Higher-Order Radiosity

We call any radiosity system that uses basis functions other than the constant functions
used by classical radiosity an example of higher-order radiosity. Such systems are
appealing because they can relax one or more of the assumptions made by classical
radiosity. In particular, the use of nonconstant basis functions means that the environment
can be rendered to a given degree of accuracy without requiring the level of
fine subdivision required by classical radiosity. This is because higher-order methods 
are able to describe changing radiosity over a surface, while classical methods rely 
on subdivision of that surface into constant elements small enough to represent that 
variation. 

A comparative study of collocation and Galerkin methods in 3D has been 
reported by Troutman and Max [440]. They found that in their implementation the 
extra cost of the Galerkin method was not justified by its performance with respect 
to collocation using a discontinuity-based meshing of the environment (discussed 
below). 

Similar results have been reported by Zatz [503] based on his implementation 
of Galerkin radiosity; he noted that although the Galerkin method required much 
less memory than collocation, both techniques required about the same amount 
of time to create an image of about the same quality with respect to a reference 
image. However, even if the image error metrics are similar, the smooth gradation of 
radiosity across a curved surface manages to avoid Mach banding and other artifacts 
that accompany finely meshed radiosity scenes. Figure 18.4 (color plate) shows an 
interior room containing only 607 surfaces; note the smooth distribution of light in 
the scene. 

One problem with higher-order methods is that they do not easily accommodate 
sharp shadows (low-order methods have a problem with this, too, but for different 
reasons). The difficulty is that a single set of basis functions needs to represent 
the light distribution across a surface, and if that distribution changes suddenly, 
then it requires some very localized high-frequency changes; these can be difficult to 
generate with smooth basis functions. Similarly, when a patch $P_p$ sits between two 
patches $P_i$ and $P_k$, we need to determine how much of the light from $P_i$ reaches $P_k$ 
based on the light description over $P_i$ and the geometry of the three patches; this 
can be difficult. Zatz has developed a shadow-mask technique which accounts for 
partial occlusion, but the method is difficult to control automatically, and scenes 
including such masks will not converge to a correct solution [503]. 

Another set of nonconstant basis functions for radiosity is the family of wavelet bases 
[384]. These are closely related to the hierarchical radiosity technique discussed 
below. 


18.3 Solving the Matrix Equation 

In Section 18.2 we wrote matrix equations that defined a set of simultaneous 
conditions on the radiosity B within an environment. These equations were generally of 
the form 

$$KB = E \tag{18.33}$$

for a set of emissivities E and a matrix K built from form factors and basis functions. 
Although formally this equation can be solved for a nonsingular K by writing 
$B = K^{-1}E$, this is often impractical. 

The problem is twofold: the sheer size of K presents storage and inversion 
problems, and the expense of computing each element is usually considerable, so we 
would prefer not to compute elements that we don’t need. 

Consider size first. The matrix K is square, containing n elements on a side when 
there are n patches in the environment. If we double the number of patches, the 
size of K quadruples. The size leads directly to storage problems: we need to put 
all of these elements somewhere, and when we access them, practical matters such as 
page faults can become serious. Related to the size is the 
difficulty of the inversion; Gaussian techniques are $O(n^3)$ in the size of the matrix 
side n [348]. This can become prohibitively expensive in modern scenes, which can 
contain tens of thousands of patches. 

The other problem with a large matrix is that computation of each element $K_{i,k}$ 
can be very expensive. Even in the classic radiosity case where we use box functions, 
we still must calculate the form factor $F_{i,k}$ between each pair of patches i and k. We 
will see a number of ways of doing this in Section 18.5, but it is generally not an 
inexpensive operation. 

So, rather than explicitly invert the matrix and multiply it with the emission 
terms, a number of iterative methods have been developed that take an initial guess 
at the solution vector B and then slowly refine that guess until it contains less than 
some prescribed amount of error. 

The iterative techniques used for solving radiosity 
problems are called relaxation methods. There are many types of relaxation methods 
described in the numerical methods literature, and efficient and stable programs for 
implementing them are widely available [348]. In this section we will review four 
methods that have proven particularly useful for radiosity: Jacobi iteration, 
Gauss-Seidel iteration, Southwell iteration, and overrelaxation. 

We will first describe these for a general matrix equation, and then discuss the 
physical interpretation of the operations in the context of radiosity solutions. Our 
presentation of these methods follows Gortler and Cohen [166]. 

We suppose that we are given the linear system 


$$Kx = b \tag{18.34}$$

where K is an $n \times n$ matrix, and x and b are n-element column vectors. We are 
given K and b, and we wish to find x. An element of x is written $x_i$ for $i \in [1, n]$, 
and an element of K is written $K_{i,k}$ for $i, k \in [1, n]$. 

We will generate a series of approximate solutions to x that are intended to 
converge to the real solution (we will assume that such a solution always exists). The 
approximation after step g is written $x^{(g)}$; the parentheses around the superscript are 
intended to remind us that this doesn't represent an exponent. As in Equation 16.19, 
we define the error in the gth approximation by 


$$e^{(g)} = x - x^{(g)} \tag{18.35}$$

and the residual in the gth approximation by 

$$r^{(g)} = b - Kx^{(g)} \tag{18.36}$$

And as we saw in Equation 16.20, 

$$r^{(g)} = Ke^{(g)} \tag{18.37}$$


Relaxation methods use the residual to refine the approximation $x^{(g)}$ and generate 
its successor $x^{(g+1)}$. The general plan is to look at some element $r_i^{(g)}$ of the residual 
vector and apply some transformation to the corresponding element $x_i^{(g)}$ so that $r_i$ 
goes to zero. This will probably cause the other elements of r to change, and 
perhaps increase, but the hope is that the general trend is toward smaller values for 
all elements of the residual vector. 

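Let's find out what we need to do to $x_i^{(g)}$ so that the next generation's 
corresponding residual $r_i^{(g+1)}$ will be zero. We begin by writing out the matrix equation 
for row i, using the new value $x_i^{(g+1)}$ for element i and the current values for all 
the other elements: 

$$K_{i,1} x_1^{(g)} + K_{i,2} x_2^{(g)} + \cdots + K_{i,i} x_i^{(g+1)} + \cdots + K_{i,n} x_n^{(g)} = b_i \tag{18.38}$$

Since we want to solve for $x_i^{(g+1)}$, we can move everything but $K_{i,i} x_i^{(g+1)}$ to the right side 
of this equation 

$$K_{i,i} x_i^{(g+1)} = b_i - \sum_{\substack{k=1 \\ k \ne i}}^{n} K_{i,k} x_k^{(g)} \tag{18.39}$$

and then divide through by $K_{i,i}$: 

$$x_i^{(g+1)} = \frac{b_i}{K_{i,i}} - \sum_{k \ne i} \frac{K_{i,k} x_k^{(g)}}{K_{i,i}} = x_i^{(g)} + \frac{r_i^{(g)}}{K_{i,i}} \tag{18.40}$$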


We call this last quantity, $r_i^{(g)}/K_{i,i}$, the correction $\Delta x_i^{(g)}$. 

    for i ← 1 to n                                      Initialize the first guess. 
        x_i ← 0 
    endfor 
    while not converged                                 Update the unknown vector. 
        r^(g) = b − K x^(g) 
        for i ← 1 to n 
            x_i^(g+1) ← x_i^(g) + r_i^(g) / K_{i,i}     Update each element. 
        endfor 
    endwhile 

FIGURE 18.5 

Jacobi iteration. 

Adjusting an element so that its residual goes to zero is called relaxing the element. 
The iteration continues until the convergence criteria are met. Typical criteria are 
that the magnitude of every element of the residual must be less than a threshold, 
and that the change in an unknown element be less than a threshold: 


$$\max(|r|) < t \qquad \left|x_i^{(g)} - x_i^{(g+1)}\right| < t \tag{18.41}$$


18.3.1 Jacobi Iteration 

The Jacobi iteration method is a straightforward application of the machinery of 
the previous section. We first create an initial guess of all zeros, and then enter a 
loop. First we test for convergence; if the error in the solution is low enough, we 
exit. Otherwise we compute the residual vector $r^{(g)}$. We now step through each of 
the n elements of x and add the correction factor required to bring its residual to 
zero. When we’re done, we return to the top and test for convergence. The Jacobi 
algorithm is summarized in Figure 18.5. 
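As a concrete illustration, the loop just described can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions: the function name `jacobi`, the tolerance, the iteration cap, and the small diagonally dominant test system are illustrative choices, not part of the original presentation.

```python
def jacobi(K, b, tol=1e-10, max_iters=10_000):
    """Jacobi iteration: compute the whole residual, then correct every element."""
    n = len(b)
    x = [0.0] * n                      # initial guess of all zeros
    for _ in range(max_iters):
        # Residual r = b - K x, computed from the current generation only.
        r = [b[i] - sum(K[i][k] * x[k] for k in range(n)) for i in range(n)]
        if max(abs(ri) for ri in r) < tol:
            break
        # Every element gets the correction that would zero its own residual.
        x = [x[i] + r[i] / K[i][i] for i in range(n)]
    return x

# Assumed test system whose exact solution is [1, 2, 3].
K = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = jacobi(K, b)
```

Note that the list comprehension builds the new vector from the old one in a single step, so no new value is used until all of them have been created.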


18.3.2 Gauss-Seidel Iteration 

The Gauss-Seidel iteration method is just like the Jacobi method except for a small 
change. Recall that the Jacobi loop begins with the calculation of the residual from 
the current $x^{(g)}$, and then the next generation's elements are computed from that 
information. This means that we don't actually use the new values in $x^{(g+1)}$ until 
they have all been created, and we use them to create the new residual $r^{(g+1)}$. The 
Gauss-Seidel method simply updates the elements in place, and calculates the residual 
anew for each element. This means that when we're updating $x_3^{(g)}$, we use the 
values $x_1^{(g+1)}$ and $x_2^{(g+1)}$ in the calculation. Rather than explicitly recalculating the 
entire residual, we use the immediate form in Equation 18.40. 

    for i ← 1 to n                                          Initialize the first guess. 
        x_i ← 0 
    endfor 
    while not converged                                     Update the unknown vector. 
        for i ← 1 to n 
            x_i ← (b_i − Σ_{k≠i} x_k K_{i,k}) / K_{i,i}     Update each element. 
        endfor 
    endwhile 

FIGURE 18.6 

Gauss-Seidel iteration. 

The Gauss-Seidel algorithm is summarized in Figure 18.6. 
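The in-place update is easy to see in code. The sketch below is illustrative only; the name `gauss_seidel`, the stopping rule based on the largest change in a sweep, and the test system are assumptions made for this example.

```python
def gauss_seidel(K, b, tol=1e-10, max_sweeps=10_000):
    """Gauss-Seidel iteration: relax each element in order, updating x in place."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_sweeps):
        biggest_change = 0.0
        for i in range(n):
            # Elements with index below i already hold this sweep's new values.
            others = sum(K[i][k] * x[k] for k in range(n) if k != i)
            new_value = (b[i] - others) / K[i][i]
            biggest_change = max(biggest_change, abs(new_value - x[i]))
            x[i] = new_value
        if biggest_change < tol:
            break
    return x

# Assumed test system whose exact solution is [1, 2, 3].
K = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = gauss_seidel(K, b)
```

The only structural difference from a Jacobi loop is that `x[i]` is overwritten immediately, so later rows in the same sweep see the new value.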


18.3.3 Southwell Iteration 

The method of Southwell iteration adds another wrinkle to the basic Jacobi 
algorithm. Like Gauss-Seidel, it uses the most recently computed unknowns to update 
each element. Notice that Gauss-Seidel always updates each element of the unknown 
in turn. So if the residual is large for only one element of the unknown and small 
for the rest, we will only get to process the element with large error every nth step. 
The Southwell method doesn’t bother looping through the elements of the unknown 
in order, but uses a greedy heuristic to relax the element with the residual of largest 
magnitude first. Then if we’re not converged, it again goes after the largest residual. 
This means that the same element can be repeatedly adjusted to the exclusion of all 
others if it’s far more out of range. 

Since we always want to relax the residual element with the greatest magnitude, 
we need to make sure that after each adjustment we update the residual vector. This 
looks expensive, because the residual depends on every unknown. Happily, the new 
residual can be computed efficiently. 

To see how to compute this new residual, start by observing that the change in 
the unknown vector x from one step to the next may be written 

$$x^{(g+1)} = x^{(g)} + \Delta x^{(g)} \tag{18.42}$$

The new residual is then 

$$\begin{aligned} r^{(g+1)} &= b - Kx^{(g+1)} \\ &= b - K\left(x^{(g)} + \Delta x^{(g)}\right) \\ &= b - Kx^{(g)} - K\Delta x^{(g)} \\ &= r^{(g)} - K\Delta x^{(g)} \end{aligned} \tag{18.43}$$

Now because we only update one element of the unknown at a time, $\Delta x^{(g)}$ is all zero 
except for component i, which is $r_i^{(g)}/K_{i,i}$. Then we can update the residual vector 
by removing just the amount due to element i from each element k: 

$$r_k^{(g+1)} = r_k^{(g)} - K_{k,i} \frac{r_i^{(g)}}{K_{i,i}} \tag{18.44}$$

To get the ball rolling we need an initial residual vector $r^{(0)}$. If as before we use 
an initial unknown guess of $x^{(0)} = 0$, then 

$$r^{(0)} = b - Kx^{(0)} = b - K0 = b \tag{18.45}$$


The Southwell algorithm is summarized in Figure 18.7. 
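A short Python sketch may help make the bookkeeping concrete. It is an illustration under assumptions: the name `southwell`, the tolerance, the step cap, and the test system are not from the original, and the largest residual is chosen by absolute value.

```python
def southwell(K, b, tol=1e-10, max_steps=100_000):
    """Southwell iteration: always relax the element with the largest residual."""
    n = len(b)
    x = [0.0] * n
    r = list(b)                        # with x = 0 the initial residual is b
    for _ in range(max_steps):
        i = max(range(n), key=lambda j: abs(r[j]))
        if abs(r[i]) < tol:
            break
        dx = r[i] / K[i][i]            # the correction that zeroes residual i
        x[i] += dx
        # Incremental residual update: only column i of K is touched.
        for k in range(n):
            r[k] -= K[k][i] * dx
    return x, r

# Assumed test system whose exact solution is [1, 2, 3].
K = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x, r = southwell(K, b)
```

Each step costs one pass over a column of K rather than a full matrix-vector product, which is the efficiency gain noted above.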


18.3.4 Overrelaxation 

The idea of overrelaxation can be used with any of the methods described above. The 
idea is that instead of subtracting out just the necessary amount from an element to 
set its residual to zero, we anticipate the need to subtract more later on and subtract 
it now. This is an aggressive strategy; the degree to which we anticipate the future is 
determined by a factor $\omega_i$ for element i of the unknown. So during any update step, 
instead of 

$$x_i^{(g+1)} = x_i^{(g)} + \Delta x_i^{(g)} \tag{18.46}$$

we use 

$$x_i^{(g+1)} = x_i^{(g)} + \omega_i \Delta x_i^{(g)} \tag{18.47}$$

The ith residual is no longer zero, but 

$$r_i^{(g+1)} = (1 - \omega_i)\, r_i^{(g)} \tag{18.48}$$

    for i ← 1 to n                          Initialize the first guess and residual. 
        x_i ← 0 
        r_i ← b_i 
    endfor 
    while not converged                     Improve estimate. 
        select i so that r_i = max(r) 
        x_i ← x_i + r_i / K_{i,i}           Update one element. 
        t ← r_i                             Get the residual just relaxed. 
        for k ← 1 to n 
            r_k ← r_k − K_{k,i} t / K_{i,i} 
        endfor                              Update the residual vector. 
    endwhile 

FIGURE 18.7 

Southwell iteration. 

If $\omega_i > 1$, the technique is called overrelaxation, while if $0 < \omega_i < 1$, the technique 
is called underrelaxation. Underrelaxation can be useful for unstable systems. 
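To see the effect of the relaxation factor, here is a minimal sketch that applies one shared factor to an in-order sweep. The names `relax_element` and `sor`, the single factor `omega` used for every element, the value 1.2, and the test system are all assumptions made for illustration.

```python
def relax_element(K, b, x, i, omega):
    """Scale the usual correction for element i by omega; return the old residual."""
    n = len(b)
    r_i = b[i] - sum(K[i][k] * x[k] for k in range(n))
    x[i] += omega * r_i / K[i][i]
    return r_i

def sor(K, b, omega=1.2, tol=1e-10, max_sweeps=10_000):
    """In-order sweeps with an overrelaxed (omega > 1) or underrelaxed update."""
    n = len(b)
    x = [0.0] * n
    for _ in range(max_sweeps):
        worst = 0.0
        for i in range(n):
            worst = max(worst, abs(relax_element(K, b, x, i, omega)))
        if worst < tol:
            break
    return x

# Assumed test system whose exact solution is [1, 2, 3].
K = [[4.0, -1.0, 0.0],
     [-1.0, 4.0, -1.0],
     [0.0, -1.0, 4.0]]
b = [2.0, 4.0, 10.0]
x = sor(K, b)
```

After a single call to `relax_element`, the residual of that element is $(1 - \omega)$ times its old value, so with $\omega = 1.2$ it changes sign and shrinks to one fifth of its former magnitude.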


18.4 Solving Radiosity Matrices 

We’ll now consider each of the matrix solution methods described above in turn as 
a method for solving radiosity problems. We will look at the physical interpretation 
of the mathematics in terms of energy transfer in an environment. We will not focus 
on issues of convergence and stability. An analysis of convergence may be found in 
Gortler and Cohen [166], where it is shown that these methods will indeed converge 
for radiosity problems. Discussions of stability may be found in numerical methods 
books such as Press et al. [348] and Ralston and Rabinowitz [353]. 

For the purpose of illustration, we will take a matrix that corresponds to the 
classical radiosity tableau of Equation 18.29. That is, in the equation 


$$KB = E \tag{18.49}$$

the n-element vectors B and E correspond to the radiosities and emittances of the n 
patches, and the $n \times n$ matrix K contains reflectivities and form factors: 

$$K_{i,k} = \delta_{ik} - \rho_i F_{i,k} \tag{18.50}$$

We will further assume that all the patches are convex, so the form factor of any 
patch to itself is zero: $F_{i,i} = 0$. 

This matrix is illustrated in schematic form in Figure 18.8 for a simple scene. 
Here we have only shown the direction of transfer implied by the form factors; the 
magnitude of the form factors and the coefficients $\rho_i$ aren't shown. 

In terms of radiosity, the patch emittances usually form the first guess for the 
patch radiosities. The residual measures the difference between the emittance and 
the reflected radiosity; that is, the radiosity that hasn’t yet been distributed into the 
environment. A common metaphor is to think of each patch as having two bins 
in which radiosity is accumulated. In one bin we have the radiosity of the patch 
itself; this is the power per unit area we would see if we looked at the patch at that 
moment. We can think of this as the energy the patch is shooting into space. In the 
other bin is the undistributed radiosity; this is some additional energy per unit area 
that the patch should be distributing into the environment, but we haven’t yet gotten 
around to taking care of computationally; this is also called unshot radiosity. 

So the residual tells us how much more energy the patch should be distributing 
into the environment than it already is; by increasing the patch’s radiosity, we drive 
down the residual, and decrease the amount of energy in the undistributed bin. 

Many practical algorithms make a time-space trade-off and compute matrix 
elements only when they are needed. In the case of the simple radiosity system this 
means that form factors are computed on the fly when a pair of patches exchanges 
power. We say these elements are computed on demand , dynamically , or lazily . 
Elements built by lazy evaluation may be cached for a fixed or indefinite period of 
time in case they are needed again, or disposed of to save on storage. 
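One way to picture lazy evaluation is as a thin wrapper around whatever routine actually computes a form factor. The sketch below is hypothetical: the class `LazyFormFactors`, its `keep` flag, and the `compute(i, k)` callback are invented for illustration and do not correspond to any particular system.

```python
class LazyFormFactors:
    """Compute form factors on demand, optionally caching them for reuse."""

    def __init__(self, compute, keep=True):
        self._compute = compute        # hypothetical callback: compute(i, k) -> F_{i,k}
        self._keep = keep              # False trades repeated work for less storage
        self._cache = {}
        self.evaluations = 0           # how many times the expensive routine ran

    def __call__(self, i, k):
        key = (i, k)
        if key in self._cache:
            return self._cache[key]
        self.evaluations += 1
        value = self._compute(i, k)
        if self._keep:
            self._cache[key] = value
        return value

    def forget(self):
        """Dispose of all cached elements to save on storage."""
        self._cache.clear()

# Stand-in for an expensive form factor routine (an assumption for this example).
cached = LazyFormFactors(lambda i, k: 0.25)
cached(0, 1)
cached(0, 1)                           # served from the cache
uncached = LazyFormFactors(lambda i, k: 0.25, keep=False)
uncached(0, 1)
uncached(0, 1)                         # recomputed each time
```

Here `cached.evaluations` ends at 1 while `uncached.evaluations` ends at 2, which is the time-space trade-off in miniature.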


18.4.1 Jacobi Iteration 

In Jacobi iteration we update all the elements of the unknown vector at once. In 
terms of radiosity, this means that the radiosity of every patch is incremented to 
represent the undistributed energy. 

This method is not widely used because of its great expense. Typically a small 
number of patches account for most of the radiosity (at least at the beginning of a 
simulation where we have a dark room and a few luminaires), and it’s wasteful to 
update all the patches at every step when they don’t contribute much to the image 
or the distribution of light in the scene. 


18.4.2 Gauss-Seidel Iteration 

Gauss-Seidel iteration updates the entire solution one step at a time, but uses the new 
values as they are computed to increase efficiency. In terms of the classical radiosity 
matrix, the Gauss-Seidel step for patch i takes the form 

$$B_i = E_i + \rho_i \sum_{\substack{k=1 \\ k \ne i}}^{n} F_{i,k} B_k \tag{18.51}$$

To see the physical interpretation of this step, we will multiply through by $A_i$, the 
area of patch i: 

$$B_i A_i = E_i A_i + \rho_i \sum_{\substack{k=1 \\ k \ne i}}^{n} B_k A_i F_{i,k} \tag{18.52}$$

and then use the reciprocity relationship of form factors in Equation 18.22 to get the 
expression in terms of $F_{k,i}$: 

$$B_i A_i = E_i A_i + \rho_i \sum_{\substack{k=1 \\ k \ne i}}^{n} B_k A_k F_{k,i} \tag{18.53}$$

We can interpret Equation 18.53 in physical terms. On the left is $B_i A_i$, the 
power coming out of patch i into the environment. This power is the sum of the 
emitted power $E_i A_i$ and the reflected power gathered from all other patches in the 
environment. The key here is the loop over all the patches: it visits each patch 
k, gathers the power $B_k A_k$, and then finds the fraction $F_{k,i}$ of that power directly 
transferred from patch k to patch i. This process is illustrated in Figure 18.9, 
where we have shown the power transfers involved in updating one patch, and those 
elements of the matrix involved using the same conventions as Figure 18.8(b). 

FIGURE 18.8 

(a) A scene of four walls. (b) A graphical representation of the form factor matrix for (a). 

Notice that what we’re doing here is finding the dot product of a vector of 
radiosities with a column of the matrix (though in the original computational form 
of Equation 18.51 it’s a row of the matrix). 
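The two forms of the gathering step can be compared numerically. In this sketch the function names, the radiosity values, and the reflectivities are assumptions; the form factors are those of an infinite shelf with walls of height 1 and a floor of width 2, chosen only because they satisfy reciprocity exactly.

```python
import math

def gather_radiosity(i, B, E, rho, F):
    """Radiosity form of the step: a dot product with row i of F."""
    return E[i] + rho[i] * sum(F[i][k] * B[k] for k in range(len(B)) if k != i)

def gather_power(i, B, E, rho, F, A):
    """Power form of the step: collect the fraction F[k][i] of each power B_k A_k."""
    return E[i] * A[i] + rho[i] * sum(B[k] * A[k] * F[k][i]
                                      for k in range(len(B)) if k != i)

# Illustrative data: patches are wall A, floor B, wall C of an infinite shelf.
g = 2.0
f_ab = 0.5 * (1.0 + g - math.sqrt(1.0 + g * g))
f_ac = math.sqrt(1.0 + g * g) - g
f_bc = 0.5 * (1.0 + 1.0 / g - math.sqrt(1.0 + (1.0 / g) ** 2))
F = [[0.0, f_ab, f_ac],
     [f_bc, 0.0, f_bc],
     [f_ac, f_ab, 0.0]]
A = [1.0, 2.0, 1.0]                    # areas per unit length of shelf
E = [1.0, 0.0, 0.0]
rho = [0.3, 0.5, 0.5]
B = [1.0, 0.2, 0.1]                    # an arbitrary current estimate

powers = [gather_power(i, B, E, rho, F, A) for i in range(3)]
radiosities = [gather_radiosity(i, B, E, rho, F) for i in range(3)]
```

Because the form factors obey reciprocity, each entry of `powers` equals the matching entry of `radiosities` multiplied by the patch area, even though one uses a row of F and the other a column.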

Figure 18.10 (color plate) shows an interior scene after different numbers of steps 
of the Gauss-Seidel algorithm. 


18.4.3 Southwell Iteration 

In Southwell iteration we look for the element with the largest residual and relax 
it. If the solution isn’t converged, we repeat the process. In terms of radiosity, this 
means we’re finding the patch with the largest undistributed radiosity and sending 
that into the environment. 

In other words, we look for the patch that has the most radiosity that has not 
yet been accounted for, and we relax that patch by sending this radiosity into the 
environment. This is called shooting the power, as distinguished from the 
gathering performed by the Gauss-Seidel algorithm. The process begins by selecting the 
brightest light source in the environment, and distributing the radiosity from that 
light to all the other surfaces. The next patch chosen might be another light source, 
or it might be a surface patch if that piece of surface received a lot of energy and had 
a high reflectivity coefficient. 

A step of the Southwell algorithm is shown in Figure 18.11. 

Figure 18.12 (color plate) shows an interior scene after different numbers of steps 
of the Southwell algorithm. 


18.4.4 Progressive Refinement 

The method of progressive refinement , introduced by Cohen et al. [95], uses a variant 
on Southwell iteration to produce a useful image at each step of the solution process 
[99]. This is desirable because it allows a designer to see estimates of the final 
simulation as the computation proceeds. 

There are a few important changes introduced in this algorithm beyond straight 
Southwell iteration. Calling $\Delta B_i$ the unshot radiosity at patch i, progressive 
refinement (PR) selects the next patch to shoot by finding the one with the largest unshot 
power $A_i \Delta B_i$, not just the largest unshot radiosity $\Delta B_i$. 

When the Southwell refinement has satisfied the termination criteria, Cohen et al. 
add a final step of Jacobi iteration to simultaneously distribute the remaining unshot 
radiosities into the scene. 
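A shooting loop of this kind can be sketched as follows. This is a simplified illustration, not the published algorithm: the function name, the tolerance, and the test scene (an infinite shelf with wall height 1 and floor width 2) are assumptions, the full form factor matrix is supplied up front, and the final Jacobi step is omitted because each patch's displayed radiosity here already includes its unshot part.

```python
import math

def progressive_refinement(E, rho, F, A, tol=1e-9, max_steps=100_000):
    """Shoot from the patch with the largest unshot power until little remains."""
    n = len(E)
    B = list(E)                        # radiosity estimate, unshot part included
    unshot = list(E)                   # Delta B_i for every patch
    for _ in range(max_steps):
        i = max(range(n), key=lambda j: A[j] * unshot[j])
        if A[i] * unshot[i] < tol:
            break
        shot = unshot[i]
        unshot[i] = 0.0
        for k in range(n):
            if k == i:
                continue
            # Reciprocity gives F[k][i] = F[i][k] * A[i] / A[k], so only row i is needed.
            received = rho[k] * F[i][k] * A[i] / A[k] * shot
            B[k] += received
            unshot[k] += received
    return B, unshot

# Illustrative scene: wall A emits; patches are wall A, floor B, wall C.
g = 2.0
f_ab = 0.5 * (1.0 + g - math.sqrt(1.0 + g * g))
f_ac = math.sqrt(1.0 + g * g) - g
f_bc = 0.5 * (1.0 + 1.0 / g - math.sqrt(1.0 + (1.0 / g) ** 2))
F = [[0.0, f_ab, f_ac], [f_bc, 0.0, f_bc], [f_ac, f_ab, 0.0]]
A = [1.0, 2.0, 1.0]
E = [1.0, 0.0, 0.0]
rho = [0.3, 0.5, 0.5]
B, unshot = progressive_refinement(E, rho, F, A)
```

When the loop stops, `B` satisfies the radiosity equation to within the leftover unshot energy, and every intermediate `B` along the way could have been displayed.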

Because the Southwell approach selects the brightest patch first, progressive 
radiosity will quickly shoot energy from the bright lights into the environment, and 
then gradually fill in the subtle details from repeated interreflections. The result of 
this process was shown in Figure 18.12. Note that it is easier to see the details in the 
picture and get an overall impression of the final image earlier in the process than 
with the Gauss-Seidel rule in Figure 18.10. However, in the early iterations, much 
of the image is still dark. Although Southwell relaxation will eventually find a 
converged solution, for practical use we would like the intermediate images (particularly 
the first few) to be closer to the final result. 

Cohen et al. have suggested a number of heuristics to improve the appearance of 
the intermediate images. Note that this is effectively postprocessing the solution at 
each step for display; we are not changing the solution process, just how the results 
are presented after each step. The idea is based on the observation that we can 
quantify $\overline{\Delta B}$, the average unshot radiosity in the scene, by simply adding up all the 
unshot power and dividing by the total area: 

$$\overline{\Delta B} = \frac{\sum_{i=1}^{n} \Delta B_i A_i}{\sum_{i=1}^{n} A_i} \tag{18.54}$$

By the same reasoning, we can find the average reflectivity $\bar{\rho}$ from 

$$\bar{\rho} = \frac{\sum_{i=1}^{n} \rho_i A_i}{\sum_{i=1}^{n} A_i} \tag{18.55}$$

Now consider what happens when the unshot radiosity is released into the scene. The 
initial release of the unshot radiosity adds $\overline{\Delta B}$ to the scene. After one reflection off 
the surfaces, $\bar{\rho}\,\overline{\Delta B}$ is reflected back into the environment, where it is reflected again, 
sending out $\bar{\rho}^2\,\overline{\Delta B}$, and so on, again and again. The result of all this reflection is $B_a$, 
an ambient term that estimates the total unshot radiosity after reflecting around the 
environment: 

$$B_a = \overline{\Delta B}\left(1 + \bar{\rho} + \bar{\rho}^2 + \cdots\right) = \frac{\overline{\Delta B}}{1 - \bar{\rho}} \tag{18.56}$$

For the purposes of intermediate images only, each patch i may be displayed with 
radiosity $B_i + \rho_i B_a$. 

FIGURE 18.11 

(a) A Southwell shooting step for surface 2. (b) The row of matrix elements involved in the transfer. 

As the radiosity estimate improves, the amount of unshot radiosity $\overline{\Delta B}$ drops, 
reducing the amount of ambient light added to the image. Figure 18.13 (color plate) 
shows an interior scene after several steps of the progressive refinement algorithm 
including ambient display. Notice how much better the early pictures appear using 
this estimate of the ambient light. 
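The ambient estimate takes only a few lines to compute. The sketch below uses made-up values for the areas, reflectivities, and unshot radiosities; the function names are likewise assumptions.

```python
def ambient_term(unshot, rho, A):
    """Area-weighted average unshot radiosity, summed over all interreflections."""
    total_area = sum(A)
    avg_unshot = sum(d * a for d, a in zip(unshot, A)) / total_area
    avg_rho = sum(r * a for r, a in zip(rho, A)) / total_area
    # Closed form of the geometric series 1 + rho + rho^2 + ... (needs avg_rho < 1).
    return avg_unshot / (1.0 - avg_rho)

def display_radiosity(B, unshot, rho, A):
    """Radiosities for an intermediate image only; the solution itself is untouched."""
    b_a = ambient_term(unshot, rho, A)
    return [b + r * b_a for b, r in zip(B, rho)]

# Made-up state early in a solution: one emitter that has not yet shot anything.
A = [1.0, 1.0, 2.0]
rho = [0.5, 0.5, 0.5]
B = [1.0, 0.0, 0.0]
unshot = [1.0, 0.0, 0.0]
shown = display_radiosity(B, unshot, rho, A)
```

For these values the average unshot radiosity is 0.25 and the average reflectivity is 0.5, so the ambient term is 0.5 and every patch is brightened by 0.25 for display.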

Another important practical aspect of the PR approach is that it does not compute 
and store the entire matrix of form factors before processing. Rather, each time a 
patch i is selected for shooting, all the form factors $F_{i,k}$ to the environment are 
computed dynamically, used for a single Southwell step, and then forgotten. This 
means that the same form factors will likely be created over and over again during 
a single simulation. This is unfortunate, but when the environment is very large it 
becomes impractical to store the form factor matrix. In this type of application, an 
efficient means for computing the form factors is imperative. An example of the PR 
algorithm in a complex environment is shown in Figure 18.14 (color plate), which 
contained 2,000 patches (the PR pass was followed by a second pass in which the 
PR solution was processed to fit a finer mesh). 


18.4.5 Overrelaxation 

As discussed earlier, overrelaxation may be added to any of the solution methods 
by scaling up the correction term at each stage. The central question is how much 
overshooting should be performed at each step. 

In an algorithm presented by Feda and Purgathofer [141], the adjusted radiosity 
$\Delta B'_i$ to shoot is computed as the minimum of two candidates: one is the estimated 
radiosity produced by the PR algorithm, and the other is this patch's area-weighted 
share of the average unshot radiosity in the scene. 

$$\Delta B'_i = \min\left(\Delta B_i + \rho_i B_a,\ \sum_{k=1}^{n} B_k A_k / A_i\right) \tag{18.57}$$

In fact, the choice of the next patch to be shot in a Southwell-type relaxation 
algorithm is made based on the overestimated energy $A_i \Delta B'_i$. 

Gortler and Cohen [166] have developed an overshooting algorithm that solves a 
restricted subproblem in the radiosity model. They select a single patch i, and solve 
for the interaction of this patch with the entire environment, including the reflection 
of energy back onto the shooting patch. 

They report good results in their tests using a relaxation factor of 1.2. 


18.4.6 Comparison 

All of the algorithms discussed above, plus some variants, were compared by Gortler 
and Cohen [166]. They constructed a number of test cases, including simple scenes 
containing cubes, an office environment, and a random matrix with the same general 
structure as a radiosity matrix (that is, diagonally dominant). 

The results are shown in Figure 18.15 for the office environment, and two random 
matrices with a few emitters (representing light sources) in a dim (low-reflectivity) 
and bright (high-reflectivity) environment. The error at step g was measured by 


$$E^{(g)} = \frac{\sum_{i=1}^{n} \left(B_i^* - B_i^{(g)}\right)}{\sum_{i=1}^{n} \left(B_i^* - E_i\right)} \tag{18.58}$$

where $B_i^*$ is the reference (or correct) value for patch i. Notice that this measure 
simply estimates accuracy, and not computational cost, speed, or storage requirements. 
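Assuming the error measure is the ratio of two plain sums, the remaining shortfall from the reference solution divided by the shortfall of the emittances alone, it can be computed as in the sketch below. The function name and the sample values are assumptions.

```python
def relative_error(B, B_ref, E):
    """Remaining shortfall from the reference, relative to starting from the emittances."""
    remaining = sum(ref - b for ref, b in zip(B_ref, B))
    initial = sum(ref - e for ref, e in zip(B_ref, E))
    return remaining / initial         # assumes the reference differs from E

# Made-up values: emittances, a reference solution, and a halfway estimate.
E = [1.0, 0.0, 0.0]
B_ref = [1.2, 0.4, 0.2]
halfway = [1.1, 0.2, 0.1]
err = relative_error(halfway, B_ref, E)
```

Under this reading the measure is 1 when the estimate equals the emittances and falls to 0 as the estimate reaches the reference.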

The algorithms are keyed in the figure by these codes: 

GSO (Gauss-Seidel iteration). The initial guess is 0, and the patches are refined in 
order. 

GSJ (Gauss-Seidel + Jacobi iteration). Like GSO, but the result at each step is the 
radiosity of each patch plus the unshot radiosity. 

S (Southwell iteration). Like GSO, except that the patches are not relaxed in order, 
but rather the patch with the largest unshot energy is selected at each step. 

SJ (Southwell + Jacobi iteration). Similar to S, except that the result at each step is 
the patch radiosity plus its unshot radiosity. 



GSE (Gauss-Seidel-E). Like GSO, but the initial guess is set to the emittances E rather 
than 0. 

Ovr (Overshooting). This is a shooting scheme described by Gortler and Cohen 
[166], where the energy of each patch is shot into the environment; reflections 
back onto that patch are immediately accounted for using an overshooting 
procedure. 

FIGURE 18.15 

Performance of radiosity algorithms. (a) An office scene. (b) A random dim matrix with few 
emitters. (c) A random bright matrix with few emitters. Redrawn from Gortler, Cohen, and 
Slusallek, figs. 3, 6, 7, p. 56. 

As we might expect, the full-fledged Gauss-Seidel iteration (GSO) performed the 
least well of all methods, because it requires an almost complete pass through the 
matrix, requiring n steps, before it has visited all the major radiators in the scene. 
Gauss-Seidel + Jacobi, Gauss-Seidel-E, and Southwell iteration all had roughly the 
same performance. Simply ordering the computation by selecting the largest unshot 
energy does not significantly reduce the error. The Southwell + Jacobi method used 
by progressive radiosity performed very well. Recall that this selects the shooting 
patches in order by unshot energy, then produces an image that is adjusted to contain 
an estimate of the unshot radiosity. This produces a better estimate in the early stages 
of the computation, but the advantages are reduced as the solution converges. The 
overshooting algorithm performed slightly better than progressive radiosity. 


18.5 Form Factors 

Recall that the form factor $F_{i,k}$ specifies the fraction of energy transferred from patch 
i to patch k. Because of its intimate link to the propagation of energy throughout an 
environment, the form factor plays an important role in image synthesis. 
Unfortunately, the definition of Equation 18.21 cannot usually be analytically integrated as 
given; we need to either change the definition or compute an approximation. 

Form factors are at the heart of any radiosity method. For this reason it is 
important that we understand how they are defined and used. One of the best 
ways to develop this understanding is to look at the various methods that have been 
developed to compute form factors. This section does not present a complete survey 
of the vast literature on form-factor methods. Rather, I have tried to summarize the most useful 
methods and ideas and point the way to the rest of the references. 


We begin by presenting the three basic form factor expressions, which link a pair of 
differential areas, a pair of finite areas, and a differential and finite area [406]. 

We start with the form factor linking two differential elements $dA_i$ and $dA_k$. 
From the definition of radiance in Equation 13.12, the power $\Phi_{i,k}$ leaving $dA_i$ and 
arriving on $dA_k$ is given by 

$$\Phi_{i,k} = L_i \cos\theta_i \, dA_i \, d\omega_k = L_i \cos\theta_i \, dA_i \, \frac{dA_k \cos\theta_k}{r^2} \tag{18.59}$$

using the geometry shown in Figure 18.16. Now if $dA_i$ is a purely diffuse emitter and 
sends its energy equally in all directions, then from Equation 13.59 its total energy 
output into the hemisphere over it will be 

$$\Phi_i = \pi L_i \, dA_i \tag{18.60}$$

FIGURE 18.16 

The geometry for the form factor from $dA_i$ to $dA_k$. 

The ratio of the energy sent from i to k to the total energy released by i is then 

$$\frac{\Phi_{i,k}}{\Phi_i} = \frac{L_i \cos\theta_i \, dA_i \, dA_k \cos\theta_k}{r^2} \cdot \frac{1}{\pi L_i \, dA_i} = \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, dA_k \tag{18.61}$$

This ratio is the fraction of the energy emitted by i that arrives at k; that is, it's the 
form factor $F_{dA_i,dA_k}$: 

$$F_{dA_i,dA_k} = \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, dA_k \tag{18.62}$$

From inspection, we see that the form factor is completely symmetric except for the 
area factor $dA_k$. This means that we can write the form factor for the power transfer 
in the opposite direction as 

$$F_{dA_k,dA_i} = \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, dA_i \tag{18.63}$$

These two expressions satisfy the reciprocity relation 

$$dA_i \, F_{dA_i,dA_k} = dA_k \, F_{dA_k,dA_i} \tag{18.64}$$

Suppose we now enlarge $dA_k$ so that it becomes a finite element $A_k$. Then we 
can integrate the form factor over all points on $A_k$, but we must be careful: if the 
point being examined at any moment on $A_k$ is not visible to $dA_i$, then it contributes 
nothing. Recalling the visibility function V(r, p) from Equation 18.3, we write it 
here as V(i, k), indicating the points on the two patches. Then we can integrate over 
$A_k$ to find 

$$F_{dA_i,A_k} = \int_{A_k} \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, V(i,k) \, dA_k \tag{18.65}$$

This relationship satisfies the reciprocity relation 

$$dA_i \, F_{dA_i,A_k} = A_k \, F_{A_k,dA_i} \tag{18.66}$$

Finally, we can enlarge $dA_i$ until it is finitely sized as well. The result is a 
generalization of Equation 18.65, except that we pick up a factor of $1/A_i$. This is because 
we are measuring the transfer of energy from $A_i$, which for a constant radiance per 
unit area is proportional to area. That is, 

$$F_{A_i,A_k} = \frac{\displaystyle\int_{A_i} \int_{A_k} \pi L_i \, \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, V(i,k) \, dA_k \, dA_i}{\pi L_i A_i} \tag{18.67}$$

which boils down to 

$$F_{A_i,A_k} = \frac{1}{A_i} \int_{A_i} \int_{A_k} \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, V(i,k) \, dA_k \, dA_i \tag{18.68}$$

or, in a slightly more comprehensible form, 

$$F_{A_i,A_k} = \frac{1}{A_i} \int_{A_i} F_{dA_i,A_k} \, dA_i \tag{18.69}$$

This finite-to-finite transfer satisfies the reciprocity relation 

$$A_i \, F_{A_i,A_k} = A_k \, F_{A_k,A_i} \tag{18.70}$$

Patches                        Form factor                                                                                                          Reciprocity rule 

Differential to differential   $F_{dA_i,dA_k} = \frac{\cos\theta_i \cos\theta_k}{\pi r^2} \, dA_k$                                                  $dA_i F_{dA_i,dA_k} = dA_k F_{dA_k,dA_i}$ 

Differential to finite         $F_{dA_i,A_k} = \int_{A_k} \frac{\cos\theta_i \cos\theta_k}{\pi r^2} V(i,k) \, dA_k$                                 $dA_i F_{dA_i,A_k} = A_k F_{A_k,dA_i}$ 

Finite to finite               $F_{A_i,A_k} = \frac{1}{A_i} \int_{A_i} \int_{A_k} \frac{\cos\theta_i \cos\theta_k}{\pi r^2} V(i,k) \, dA_k \, dA_i$   $A_i F_{A_i,A_k} = A_k F_{A_k,A_i}$ 

TABLE 18.2 

Form factors and reciprocity rules. 


These three form factors and reciprocity rules are summarized in Table 18.2. 
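As a rough numerical illustration of the finite-to-finite entry and its reciprocity rule, the sketch below estimates Equation 18.68 by Monte Carlo sampling for two unoccluded patches, so $V(i,k) = 1$ throughout. The patch representation (an origin, two edge vectors, and a unit normal) and the function names are assumptions made for this example only.

```python
import numpy as np

def sample_patch(origin, u, v, n, rng):
    """Return n points uniformly distributed on the parallelogram origin + s*u + t*v."""
    s = rng.random((n, 1))
    t = rng.random((n, 1))
    return origin + s * u + t * v

def patch_area(patch):
    """Area of a (hypothetical) patch tuple (origin, edge_u, edge_v, unit_normal)."""
    _, u, v, _ = patch
    return np.linalg.norm(np.cross(u, v))

def form_factor_mc(patch_i, patch_k, n=200_000, seed=0):
    """Monte Carlo estimate of F_{A_i,A_k} (Equation 18.68), assuming V(i,k) = 1."""
    rng = np.random.default_rng(seed)
    oi, ui, vi, ni = (np.asarray(a, dtype=float) for a in patch_i)
    ok, uk, vk, nk = (np.asarray(a, dtype=float) for a in patch_k)
    xi = sample_patch(oi, ui, vi, n, rng)
    xk = sample_patch(ok, uk, vk, n, rng)
    d = xk - xi                       # vectors from points on A_i to points on A_k
    r2 = np.sum(d * d, axis=1)
    r = np.sqrt(r2)
    cos_i = np.clip((d @ ni) / r, 0.0, None)
    cos_k = np.clip(-(d @ nk) / r, 0.0, None)
    # (1/A_i) * double integral equals A_k times the mean of the kernel
    # over uniformly chosen point pairs.
    return patch_area(patch_k) * np.mean(cos_i * cos_k / (np.pi * r2))

# A unit square on the floor facing a 2 x 1 rectangle one unit above it.
floor = ((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1))
ceiling = ((0, 0, 1), (2, 0, 0), (0, 1, 0), (0, 0, -1))
f_ik = form_factor_mc(floor, ceiling, seed=1)
f_ki = form_factor_mc(ceiling, floor, seed=2)
print(patch_area(floor) * f_ik, patch_area(ceiling) * f_ki)
```

The two printed products should agree to within the sampling noise, which is what the finite-to-finite reciprocity rule predicts.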

In general, closed-form expressions for form factors are hard to come by. They 
can be carried out for some of the traditional simple geometries for integration 
(e.g., spheres, infinite planes, and infinite cylinders), but most practical shapes elude 
analytical integration. A catalog of some useful form factors collected from the 
literature is given in Appendix D. 

A remarkable exception to this rule is the closed-form relation between two 
arbitrary (but unoccluded) polygons, recently developed by Schroder and Hanrahan 
[385,386]. This is a complex result; details are given in Appendix D. 

The problem is not as bad in the restricted world of two dimensions. Analytic 
expressions for form factors between linear elements in the 2D world of Flatland 
have been developed by Heckbert [202]. 


18.5.2 Contour Integration

The form factor integrals of Table 18.2 may be recast into another form that is 
sometimes more convenient to integrate. Sparrow has noted that Stokes’ theorem 
can be applied to the form factor integrals to change them from area-based integrals 
to contour-based integrals [416]. An important assumption of all contour-based 
methods is that the two objects are completely visible to one another; that is, there 
are no objects anywhere in the space between them. 

Suppose that we have an infinitesimal area $dA_i$ located at $(x_i, y_i, z_i)$ with normal
$(l_i, m_i, n_i)$, and we wish to find its form factor with respect to a finite patch $A_k$,
where $(x_k, y_k, z_k)$ represents any point on the contour $C_k$ of $A_k$, and $r$ is the distance
from $dA_i$ to that point. Sparrow [416] showed that we can write this form factor as

$$\begin{aligned}
F_{dA_i,A_k} = {} & l_i \oint_{C_k} \frac{(z_k - z_i)\, dy_k - (y_k - y_i)\, dz_k}{2\pi r^2} \\
& + m_i \oint_{C_k} \frac{(x_k - x_i)\, dz_k - (z_k - z_i)\, dx_k}{2\pi r^2} \\
& + n_i \oint_{C_k} \frac{(y_k - y_i)\, dx_k - (x_k - x_i)\, dy_k}{2\pi r^2}
\end{aligned} \tag{18.71}$$


For two finite elements $A_i$ and $A_k$ with respective contours $C_i$ and $C_k$, the result is
rather simpler:

$$F_{A_i,A_k} = \frac{1}{2\pi A_i} \oint_{C_i} \oint_{C_k} \left( \ln r\, dx_i\, dx_k + \ln r\, dy_i\, dy_k + \ln r\, dz_i\, dz_k \right) \tag{18.72}$$

where $r$ is the distance between the points on each contour.

Equation 18.72 formed the basis for the form factor calculations used in the 
original radiosity paper by Goral et al. [165]. More recently, this result has been 
combined by Sun et al. with the principle of linearity to precompute components 
of form factors and then construct new form factors on the fly from a table lookup 
[428]. 

This method has also been used by Nishita and Nakamae to calculate the illumination
due to an area light source [321]. For the transfer from a differential patch to
a finite polygon, Equation 18.71 has a particularly simple geometric form. Suppose
the polygon has $n$ sides and vertices $V_1, V_2, \ldots, V_n$. Call $T_i$ the triangle formed by
$dA_i$, $V_i$, and $V_{i+1}$, with normal $S_i$ (the normal may be calculated from the cross
product of two sides of the triangle). The geometry is shown in Figure 18.17.

Define $\alpha_i$ as the angle between $S_i$ and the plane of $dA_i$, and $\beta_i$ as the angle of the
triangle nearest to $dA_i$. Then the form factor may be computed by

$$F_{dA_i,A_k} = \frac{1}{2\pi} \sum_{i=1}^{n} \beta_i \cos\alpha_i \tag{18.73}$$
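The sum in Equation 18.73 is short enough to sketch directly. The code below is a minimal illustration for an unoccluded polygon lying entirely above the tangent plane of the differential patch. It assumes that $\cos\alpha_i$ can be taken as the dot product of the unit triangle normal $S_i$ with the unit normal of the differential patch, which is one common reading of the angle convention, and it takes the absolute value of the sum so that the vertex winding order does not matter. The function name and argument layout are hypothetical.

```python
import numpy as np

def point_to_polygon_form_factor(point, normal, vertices):
    """Differential-to-polygon form factor via the edge sum of Equation 18.73.

    Assumes the polygon is unoccluded and entirely above the tangent plane.
    """
    p = np.asarray(point, dtype=float)
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)
    verts = np.asarray(vertices, dtype=float)
    count = len(verts)
    total = 0.0
    for i in range(count):
        # Unit vectors from the differential patch to two consecutive vertices.
        r0 = verts[i] - p
        r1 = verts[(i + 1) % count] - p
        r0 /= np.linalg.norm(r0)
        r1 /= np.linalg.norm(r1)
        # beta_i: angle of triangle T_i at the differential patch.
        beta = np.arccos(np.clip(np.dot(r0, r1), -1.0, 1.0))
        # S_i: unit normal of triangle T_i.
        s = np.cross(r0, r1)
        s /= np.linalg.norm(s)
        total += beta * np.dot(n, s)
    # abs() removes the dependence on clockwise versus counterclockwise ordering.
    return abs(total) / (2.0 * np.pi)

# A 2 x 2 square centered one unit above a patch at the origin facing +z.
square = [(1, -1, 1), (1, 1, 1), (-1, 1, 1), (-1, -1, 1)]
print(point_to_polygon_form_factor((0, 0, 0), (0, 0, 1), square))
```

For this square the result can be compared with the standard closed-form expression for a differential element beneath the corner of a parallel rectangle, applied to each of the four quadrants.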


If there are other polygons between dAi and A k , we can use an algorithm like 
the one in Atherton et al. [20] to clip them to the boundary of this pyramid and 
then to each other, so that only one polygon is intercepted by any ray from dAi into 
the pyramid defined by A k . The form factor for each of these polygons may then 
be computed as above and then subtracted from the total found for A k . If such 
a clipping algorithm is available, it may be easier to simply clip A k first, and then 
compute its proper form factor directly. 
FIGURE 18.17

The geometry for contour integration of a polygon. 


18.5.3 Physical Devices

In the engineering literature, we can find descriptions of physical devices that have 
been built to compute form factors, either directly or indirectly. These devices are 
interesting in their own right, because they help us improve our intuitive feel for the 
geometry behind the definitions of form factors. 

One device begins with the idea of the Nusselt analog, originally described in 1928
[289]. Nusselt observed that the differential-to-finite form factor $F_{dA_i,A_k}$ between a
differential element $dA_i$ and an unoccluded finite patch $A_k$ can be computed in the
following way, as illustrated in Figure 18.18. We’re going to integrate over many
small pieces $dA_k$ of $A_k$. For each piece, find the solid angle $d\omega_k = dA_k \cos\theta_k / r^2$;
think of this as the projection of $dA_k$ onto a hemisphere of radius 1 above $dA_i$. Now
project that onto the tangent plane at $dA_i$, which is found from $d\omega_k \cos\theta_i$. Now the
base of the hemisphere has area $A = \pi r^2 = \pi$, since the hemisphere has radius 1.
Finding the ratio of the projected area to the area of the base of the hemisphere, we
find

$$\frac{dA_k \cos\theta_k}{r^2}\, \cos\theta_i\, \frac{1}{\pi} \tag{18.74}$$

and integrating this over all of $A_k$ gives us

$$\int_{A_k} \frac{\cos\theta_i \cos\theta_k}{\pi r^2}\, dA_k \tag{18.75}$$

which is just the same as the differential-to-finite form factor defined in
Equation 18.65.

FIGURE 18.18

The Nusselt analog.
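The Nusselt construction also suggests a simple sampling scheme, sketched below under stated assumptions: choose points uniformly on the unit disk at the base of the hemisphere, lift each one vertically onto the hemisphere to obtain a direction, and count the fraction of those directions that strike the patch. That fraction is the ratio of projected area to base area. The test patch here is a disk parallel to and centered over the differential element, chosen because its form factor has the well-known closed form $R^2/(R^2 + h^2)$. The function names and the callback interface are hypothetical.

```python
import numpy as np

def nusselt_form_factor(hits_patch, n=200_000, seed=0):
    """Estimate a differential-to-finite form factor with the Nusselt analog.

    hits_patch takes an (n, 3) array of unit directions (z is the surface
    normal) and returns a boolean array marking directions that hit the patch.
    """
    rng = np.random.default_rng(seed)
    # Uniform points on the unit disk that forms the base of the hemisphere.
    rho = np.sqrt(rng.random(n))
    phi = 2.0 * np.pi * rng.random(n)
    x = rho * np.cos(phi)
    y = rho * np.sin(phi)
    # Lift each base point straight up onto the unit hemisphere.
    z = np.sqrt(1.0 - rho * rho)
    dirs = np.stack([x, y, z], axis=1)
    # Fraction of the base covered by the projected patch.
    return np.mean(hits_patch(dirs))

def hits_disk(dirs, radius=1.0, height=1.0):
    """True for directions that hit a disk parallel to and centered over the element."""
    z = np.maximum(dirs[:, 2], 1e-12)
    t = height / z                     # ray parameter where the ray meets the plane
    px = dirs[:, 0] * t
    py = dirs[:, 1] * t
    return px * px + py * py <= radius * radius

print(nusselt_form_factor(hits_disk))  # compare with R^2 / (R^2 + h^2) = 0.5
```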

The Nusselt analog was employed by Eckert in 1935 to make a form factor 
computing device, pictured schematically in Figure 18.19 [289]. A small light source 
was placed at the center of a hemisphere of frosted glass (Eckert used milk glass), 
and the (opaque) object to be measured was suspended inside the hemisphere in the 
proper orientation. The lights in the room were turned off, and a camera was placed 
far from the light source, along a line through the light and perpendicular to the base 
of the hemisphere. 
FIGURE 18.19

The Eckert setup for measuring form factors.


Because the only light was coming from the center of the hemisphere, the test 
object cast a shadow on the glass. The camera was far enough away that a picture 
of the hemisphere could be considered a parallel projection of the hemisphere onto 
a plane parallel to its base; that is, the image on the film corresponded to what you 
would get if you projected the hemisphere onto the base as in the Nusselt analog. 
Eckert measured the area of the shadow of the object and the area of the base of the 
hemisphere; the ratio of these two figures was the differential-to-finite form factor 
for that object. 

Another device for measuring form factors was built by Farrell in 1976 [140]. 
The purpose was to measure the form factors of objects on a drawing with respect 
to a luminaire; this could be used to help determine the illumination on the floor of 
a large open building. 

Pictured schematically in Figure 18.20, it consisted of a cylindrical light source 
inside a plastic tube onto which dots were painted. The spacing of the dots was such 
that when the lamp was directed downward onto the drawing, each dot represented 
a form factor of 0.001. The form factor of an object in the drawing (say the floor of 
a room) with respect to the luminaire could be found simply by counting the dots in 
the room and multiplying by 0.001. 

Farrell also provides pointers to other physical devices built to help measure form 
factors. 
FIGURE 18.20

The Farrell device for measuring form factors.


18.5.4 Projection

As we saw in the last section on Nusselt’s analog, form factors between a differential
and a finite patch have a lot in common with solid angles. We can imagine building
up a library of solid angles $\Gamma_k$, and precomputing the form factor $F_{dA_i,\Gamma_k}$ associated
with each solid angle $\Gamma_k$. For any given patch $B$, we can approximate its solid angle
$\Gamma_B$ by putting together pieces from the library. Because these solid angles don’t
overlap, we can simply add together the form factor associated with each one:

$$\Gamma_B \approx \Gamma_1 \cup \Gamma_2 \cup \cdots \cup \Gamma_n$$

$$F_{dA_i,B} \approx F_{dA_i,\Gamma_1} + F_{dA_i,\Gamma_2} + \cdots + F_{dA_i,\Gamma_n} \tag{18.76}$$

This is the basic idea behind projection methods. Each of these methods selects 
a projection surface, which is a surface for which efficient projection algorithms are
known. We’ll see that this is usually a hemisphere or plane. The surface is first 
subdivided into n disjoint (that is, nonoverlapping) cells, and placed over some 
imaginary differential surface. We then pretend that each cell is the solid angle 
occupied by an object, and compute the form factor for that object; this is the library 
of form factors mentioned above. 

To use the library to compute a form factor for a particular differential and finite 
patch, we place the surface over the differential patch and project the finite patch 
onto it. This determines the visibility of the patch, and tells us which solid angles 
to add to approximate the solid angle of the finite patch. For each occupied solid 
angle, we include the bit of form factor associated with that angle. The result is an 
approximate form factor for the cost of a projection step. 

The elements of this library are often called delta form factors, written $\Delta F$,
because to make up a form factor, we accumulate many of these library elements by
adding them into a running sum.

The algorithm may be made more sophisticated by allowing the library to contain
overlapping pieces; this can allow a better fit to the real solid angle at the expense of
some extra bookkeeping when computing the form factor.


Hemicubes

The hemicube method developed by Cohen and Greenberg was the first projection 
method used for evaluating form factors in computer graphics [96]. The basic idea is 
to surround the differential patch with half a cube, as in Figure 18.21. One full face 
of the cube sits over the patch, parallel to its local tangent plane, and four half-faces 
surround it. The cube faces are tiled in a regular grid, and a delta form factor is 
precomputed for each grid cell and stored with that cell. 
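To make the precomputation concrete, here is a minimal sketch of a delta form factor table. It assumes a hemicube whose top face is the square $[-1,1]^2$ at height 1 and whose side faces have height 1, and it evaluates the differential kernel $\cos\theta_i \cos\theta_k / (\pi r^2)$ at each cell center. For a top-face cell centered at $(x, y, 1)$ both cosines equal $1/r$, giving $\Delta F = \Delta A / (\pi (x^2 + y^2 + 1)^2)$; for a side-face cell centered at $(1, y, z)$ the cosines are $z/r$ and $1/r$, giving $\Delta F = z\, \Delta A / (\pi (y^2 + z^2 + 1)^2)$. The function name and the grid resolution are illustrative choices.

```python
import numpy as np

def hemicube_delta_form_factors(res=64):
    """Return (top, side) delta form factor tables for a unit hemicube.

    top has shape (res, res); side has shape (res, res // 2) and applies to
    each of the four half-faces. res is assumed to be even.
    """
    h = 2.0 / res                                  # cell edge length
    c = -1.0 + h * (np.arange(res) + 0.5)          # cell centers across [-1, 1]
    cz = h * (np.arange(res // 2) + 0.5)           # cell centers across [0, 1]
    x, y = np.meshgrid(c, c, indexing="ij")
    top = h * h / (np.pi * (x * x + y * y + 1.0) ** 2)
    ys, zs = np.meshgrid(c, cz, indexing="ij")
    side = zs * h * h / (np.pi * (ys * ys + zs * zs + 1.0) ** 2)
    return top, side

top, side = hemicube_delta_form_factors(64)
print(top.sum(), side.sum(), top.sum() + 4.0 * side.sum())
```

Because the five faces together cover the whole hemisphere above the patch, the delta form factors should sum to very nearly 1, which makes a convenient sanity check on any table of this kind.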

The big practical benefit of the hemicube method comes from the wide availability
of fast hardware for scan-converting polygons into pixels, which form a regular
grid on a plane. Thus, the rendering hardware (usually Z-buffer based) in many
modern graphics computers may be used to compute the visibility term from a
differential area to all the polygons in the environment at once by simply rendering
the environment onto the five walls of the hemicube. The only trick is to be able
to identify the polygons from the final image, but this may be done easily by simply
using a different, constant color for each polygon when rendering; the polygon
number is then given by its color. Other methods, such as maintaining the object tag
in a separate buffer, are also available on some systems. This approach takes care of
occlusion automatically, since it’s a natural part of any scan-conversion renderer.

FIGURE 18.21

The hemicube sits over a differential patch. Redrawn from Cohen and Greenberg in Computer
Graphics (Proc. Siggraph ’85), fig. 6, p. 35.
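Once a face has been rendered with one identifier per polygon, turning the image into form factors is a single accumulation pass. The sketch below shows that step for the top face only. It assumes the rendered result is available as an integer array of patch identifiers, with $-1$ marking empty cells; this encoding, the function names, and the hand-filled item buffer standing in for a real rendering are all assumptions of the example. The top-face delta form factors use the cell-center kernel $\Delta A / (\pi (x^2 + y^2 + 1)^2)$ for a face at height 1.

```python
import numpy as np

def top_face_delta_form_factors(res):
    """Delta form factors for the top hemicube face [-1, 1]^2 at height 1."""
    h = 2.0 / res
    c = -1.0 + h * (np.arange(res) + 0.5)
    x, y = np.meshgrid(c, c, indexing="ij")
    return h * h / (np.pi * (x * x + y * y + 1.0) ** 2)

def accumulate_form_factors(item_buffer, delta_ff, num_patches):
    """Sum the delta form factors of every cell covered by each patch.

    item_buffer holds one patch identifier per cell, or -1 for an empty cell.
    """
    covered = item_buffer >= 0
    return np.bincount(item_buffer[covered],
                       weights=delta_ff[covered],
                       minlength=num_patches)

res = 8
delta = top_face_delta_form_factors(res)
items = np.full((res, res), -1, dtype=int)   # stand-in for a rendered item buffer
items[: res // 2, :] = 0                     # patch 0 covers half of the face
items[res // 2 :, 2:6] = 1                   # patch 1 covers a smaller region
form_factors = accumulate_form_factors(items, delta, num_patches=2)
print(form_factors)
```

A full implementation would repeat the accumulation for the four side faces and add the results into the same array.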

Because it is placed over a point, the hemicube algorithm does not compute 
the finite-to-finite patch transfers that take place in a real environment. Rather, it 
simulates these transfers by placing the hemicube at the center of a finite patch and 
treating the incident radiosity as a constant over the patch. 

Although the hardware Z-buffer approach for scan conversion is particularly
efficient, software approaches must be used when the hardware is not available.
Vilaplana and Pueyo have noted that the visible image of a scene often doesn’t change
much when we move a small distance. This means that two nearby hemicubes will
be very similar, so when evaluating visibility on a hemicube, we may use information
from an existing neighbor to speed the process [454].

Additionally, other visibility culling techniques may be used to accelerate the 
scan-conversion process in the visibility step [169,178,368,435]. 

The hemicube method is attractive because it is simple to understand and
implement, and when the right hardware is available, it is very efficient. But the approach
has disadvantages as well.

Baum et al. [33] have identified three major assumptions implicit in the hemicube
method. Assuming that the hemicube is centered over a patch $A_i$, the hemicube
algorithm assumes

Proximity: The distance between the patch $A_i$ and all other patches $A_k$ is large
compared with the size of $A_i$.

Visibility: The visibility of $A_k$ does not vary over the surface of $A_i$.

Aliasing: The periodic sampling pattern of cells on the hemicube faces is sufficient
to obtain a high-quality estimate of the projection of $A_k$.

When any one or more of these assumptions is violated, the accuracy of the computed
form factors suffers.

The first assumption is violated by the condition in Figure 18.22, because the two
patches are adjacent. Baum et al. calculated the analytic form factors for this pair
of surfaces to be $F_{k,i} = 0.247$ and $F_{i,k} = 0.0494$. Assuming that a hemicube with
infinite resolution was placed in the center of each patch, the computed form factors
would be $F'_{k,i} = 0.238$ and $F'_{i,k} = 0.00857$.

The values for $F_{k,i}$ are relatively close because the distance from any interior
point on $P_k$ to a particular point on $P_i$ is relatively close to a constant. But if we fix
a point on $P_k$ and roam over $P_i$, the distance will vary quite a bit. In other words,
the solid angle subtended by $P_i$ from almost anywhere on $P_k$ is roughly constant,
but the solid angle occupied by $P_k$ from points on $P_i$ varies quite a bit, and we have
seen that the form factor is closely related to the solid angle. When we calculate the
analytic form factor, this change in the solid angle is accounted for, but when we
use the hemicube, a single solid angle (taken from the center of the patch) is used
to represent them all. Because the solid angle is not linear with distance, the large
values up near the common edge and the small values out near the far end of $P_i$ do
not cancel out.

The patches do not need to be at right angles to violate the proximity assumption;
many other configurations will also fail. For example, consider a patch $P_k$ that is
almost coplanar to the patch $P_i$ on which the hemicube sits, but slightly above the
plane and tilted slightly inward. There is some small exchange of energy between
these two patches, but the hemicube cell pointing toward $P_k$ records either a full
form factor related to the size of the cell or nothing at all.

FIGURE 18.22

A violation of the hemicube proximity assumption.


The visibility assumption says that if a patch $P_k$ is visible from the center of $P_i$,
then all of $P_i$ is visible to that point on $P_k$. After all, the only visibility information
we’re gathering is where the hemicube is located at the center of the patch, so we are
assuming that information is true at all other points. This condition is easily violated
by an occluding object between the two patches that does not happen to lie on the
line from the hemicube center through the hemicube sample, as in Figure 18.23.

Finally, the aliasing assumption is a natural result of the periodic, finite-resolution 
grid used by the hemicube as a sampling pattern. The hemicube can fall prey to all the 
aliasing problems discussed in Unit II, which can result in missed objects and incorrect 
form factors. Figure 18.24 shows simple examples of over- and underestimates of 
form factors because of the limited and periodic sampling resolution. 

One of the worst effects of the aliasing problem is that the distribution of light in 
the scene can be splotchy. A large patch in the foreground of an image may be small 
with respect to a distant but bright light source, and may be missed by the hemicube; 
that omission will surely be noticed. As with other periodic sampling methods, 
the hemicube distribution pattern may beat with the distribution of polygons in the 
environment. Figure 18.25 shows a linear mesh of polygons being illuminated by 
a patch Ai using a hemicube. Notice that only every other patch is illuminated, 
causing black-and-white stripes on the mesh that should be uniformly illuminated. 

Baum et al. suggest that when one of these three assumptions is violated, an
analytic routine (such as a contour-integral method) should be used instead of the
hemicube for that form factor.

FIGURE 18.23

A violation of the hemicube visibility assumption.

Wallace et al. [461] note that the hemicube algorithm can only compute form 
factors to finite patches, but ultimately the radiosity calculation transfers radiosity 
to the vertices of the environment for display. This is why the hemicube method needs 
to average the polygon radiosities at each vertex when the system is converged. 


Other Surfaces 

Other surfaces have been used to generate the library of form factors for projection 
algorithms. Because they are all based on projection onto a single point, they have 
many of the same drawbacks as the hemicube method. 

Sillion and Puech [408] used a single large plane rather than a five-sided hemicube,
as shown in Figure 18.26. They adaptively subdivided the plane until each cell was
empty, fully covered by a single object, or too small to be subdivided further. A
single-plane projection method was also used by Recker et al. [356].

FIGURE 18.24

Violations of the hemicube aliasing assumption. (a) Overestimating. (b) Underestimating.
(c) Periodicity failure.

FIGURE 18.25

A polygon mesh beating with the hemicube pattern.

The principal advantage of the single-plane projection over the hemicube is that
only a single projection step is required, not five. A drawback is that the technique will
miss objects near the horizon, since there is a gap where the hemicube sides used to
sit. We can argue that light arriving along directions that are nearly parallel to the
local surface plane is unlikely to contribute much radiosity, so this omission is not
much of a loss given all the other approximations inherent in the method.
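We can put a rough number on the horizon gap. The sketch below assumes the projection plane is a square of half-width $w$ centered at unit height over the differential patch, and uses the standard closed-form factor from a differential element to a parallel rectangle with one corner on the element's normal, summed over the four quadrants. The value it returns is the fraction of the hemisphere's total form factor that the plane can capture; the remainder is what gets lost near the horizon. The function name and the particular sizes tried are illustrative only.

```python
import numpy as np

def centered_square_form_factor(half_width, height=1.0):
    """Form factor from a differential element to a parallel, centered square.

    Uses the closed form for a rectangle with a corner on the element's normal
    (sides a and b, distance c), with a = b = half_width, for each quadrant.
    """
    a = half_width
    d = np.sqrt(a * a + height * height)
    corner = (1.0 / (2.0 * np.pi)) * 2.0 * (a / d) * np.arctan(a / d)
    return 4.0 * corner

for w in (1.0, 2.0, 5.0, 20.0):
    captured = centered_square_form_factor(w)
    print(f"half-width {w:5.1f}: captured {captured:.4f}, missed {1.0 - captured:.4f}")
```

With $w = 1$ the plane coincides with the top face of a unit hemicube and captures only a little over half of the total, so a single plane needs to be several times wider than its height before the omission becomes small.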

The single-plane projection method was also used by Zhou and Peng [504], 
who used two planes to distinguish between visibility information and form factor 
information. 

Hemispheres have been used for projection algorithms by Van Wyk [451] and 
Spencer [418]. The most direct approach subdivides the surface of the hemisphere 
using latitude and longitude lines, as shown in Figure 18.27. Unfortunately, coverage 
and scan-conversion are difficult to perform for this curvilinear grid. 

An alternative is inspired by the Nusselt analog: forget about discretizing the
surface of the hemisphere, and instead discretize the base [289,418,451]. To compute
a form factor, estimate the solid angle on the hemispherical surface and then project
it down onto the discretized base, where the number of covered cells can be counted.
The ratio of the area of each interior cell to the area of the base is a constant, and the
ratios for the cells that are only partly within the hemisphere can be precomputed and
saved. This approach can be shown to produce estimates using far fewer cells than required
by a comparable hemicube, but the projection onto the sphere must be efficient for
the method to be practical.

FIGURE 18.26

The single-plane projection method.

FIGURE 18.27

Subdivision of the hemisphere by latitude and longitude.

FIGURE 18.28

Positioning the cube for the cubic-tetrahedral projection method.

Projection of a patch onto the hemisphere and then down onto the surface was 
also investigated by Nishita and Nakamae in their development of form factors for 
unoccluded Bézier patches [322]. They derived form factors for a patch that was
trapped between two latitudinal and two longitudinal great circles. This is like the 
precomputed subdivision methods above, but has the advantage that the grid cell 
adapts to fit the projected surface. An analytic equation based on the angles of 
the great circles gives an approximate form factor. The algorithm becomes more 
complex if the patch is partially occluded. 

Another variation on the hemicube was developed by Beran-Koehn and Pavicic
[37]. Rather than embed a subdivided cube into a surface so that its top face is
parallel to the surface, as in the hemicube method, they embed the cube so that one
corner sticks up above the surface and the three adjacent corners are in the tangent
plane, as in Figure 18.28. They call this the cubic-tetrahedral method, since they
use a single tetrahedral corner of a cube. This has the advantage of surrounding the
point like the hemicube, but only requiring three projections. The delta form factors
for the three faces are presented in Beran-Koehn and Pavicic [38].


Line Distributions 

There is a great body of literature in the computational geometry field that
discusses problems involving the intersections of lines and surfaces, and counting those
lines in various ways. The theses by de Berg [119] and van Kreveld [449] contain
bibliographies that point to much of this literature.

Two general techniques have been used in graphics to compute form factors with 
clusters of lines: ray tracing and line densities . 


Ray Tracing

Wallace et al. [461] observed that since the radiosity solution is usually reconstructed 
by interpolating the radiosities at vertices, then we ought to compute the radiosity 
directly at those vertices. This means that during progressive radiosity we need 
finite-to-differential form factors from a selected finite patch to all the differential 
elements (vertices) in the scene. The hemicube algorithm provides just the opposite 
information: differential-to-finite form factors from a single differential element to 
all the finite elements. 

The approach taken by Wallace et al. was to subdivide the surface of the shooting
patch $A_i$ into $n$ smaller pieces, $\Delta A_i^m$, and then compute the form factor to each
differential patch $dA_j$ as a sum of the form factors from each piece:

$$F_{A_i,dA_j} \approx \sum_{m=1}^{n} F_{\Delta A_i^m,dA_j}\, V(i_m, j) \tag{18.77}$$

where $V(i_m, j)$ is the visibility term between subpatch $\Delta A_i^m$ and $dA_j$.

This method raises three issues: how to test the visibility term $V$, how to subdivide
the surface, and how to compute the individual form factors.

The first issue is easily addressed by tracing a ray from the center of $dA_j$ to some
point on $\Delta A_i^m$. If the ray strikes no other object between these two points, they are
mutually visible, and generally we assume that this means the entire finite subpatch
is visible to the entire differential patch.

As we saw with the hemicube, this type of visibility assumption is risky. But the
risk is lower in the ray-tracing method because the finite patch is smaller than the
original patch, and because we can adaptively refine the sampling until we think it is
accurate. Wallace et al. [461] derive the adaptive sampling using binary subdivision,
as shown in Figure 18.29. They tracked the energy transferred from each cell to
the differential receiver, and subdivided the cell if the energy it transferred to the
particular receiver via its form factor represented more than a threshold amount of
energy.

FIGURE 18.29

Subdivision of a large emitter $A_i$ into smaller pieces based on the differential receiver $dA_j$,
shown shaded.

The individual form factors are computed by imagining that each subpatch $\Delta A_i^m$
is replaced by a disk in the same position with the same area. Wallace et al. use
an approximate closed-form expression for the form factor from such a disk to a
differential element.
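The pieces of this method fit together as in the sketch below, which is an illustration under several assumptions rather than a transcription of the published algorithm. The source is a parallelogram split into a regular grid of subpatches; visibility is supplied by a caller-provided function standing in for a ray tracer; and each visible subpatch contributes through a disk approximation of the form $\Delta A \cos\theta_r \cos\theta_s / (\pi r^2 + \Delta A)$, which is the expression commonly quoted for this method but is not stated in the text above, so treat it as an assumption. The code evaluates the factor at the receiving vertex (the differential-to-finite direction), which is related to the finite-to-differential factor of Equation 18.77 by the reciprocity rule of Equation 18.66. All names are hypothetical.

```python
import numpy as np

def disk_form_factor(receiver, n_r, center, n_s, area):
    """Assumed disk approximation for one subpatch of the given area."""
    d = center - receiver
    r2 = float(d @ d)
    r = np.sqrt(r2)
    cos_r = max(float(d @ n_r) / r, 0.0)     # angle at the receiving vertex
    cos_s = max(-float(d @ n_s) / r, 0.0)    # angle at the source subpatch
    return area * cos_r * cos_s / (np.pi * r2 + area)

def ray_traced_form_factor(receiver, n_r, origin, u, v, n_s, visible, splits=8):
    """Sum disk approximations over a splits x splits grid of visible subpatches.

    visible(p, q) is a stand-in for a ray cast: it returns True when nothing
    blocks the segment from point p to point q.
    """
    receiver, n_r, origin, u, v, n_s = (
        np.asarray(a, dtype=float) for a in (receiver, n_r, origin, u, v, n_s))
    d_area = np.linalg.norm(np.cross(u, v)) / splits**2
    total = 0.0
    for a in range(splits):
        for b in range(splits):
            center = origin + (a + 0.5) / splits * u + (b + 0.5) / splits * v
            if visible(receiver, center):
                total += disk_form_factor(receiver, n_r, center, n_s, d_area)
    return total

# A unit-square source one unit above a vertex at the origin.
source = dict(origin=(-0.5, -0.5, 1.0), u=(1, 0, 0), v=(0, 1, 0), n_s=(0, 0, -1))
unblocked = ray_traced_form_factor((0, 0, 0), (0, 0, 1), visible=lambda p, q: True, **source)
half_blocked = ray_traced_form_factor((0, 0, 0), (0, 0, 1), visible=lambda p, q: q[0] < 0.0, **source)
print(unblocked, half_blocked)
```

In the second call a hypothetical occluder hides every sample with positive $x$, so by symmetry the result is half of the unoccluded value.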

A disadvantage of this approach as stated is that it induces a regular sampling grid
on the shooting patch $A_i$. Since we assume that the patch is a pure diffuse radiator,
the sampling pattern is unlikely to create artifacts because of its distribution on the
source, but it may interact with the other geometry in the scene when shadow testing.
Wallace et al. address this problem by jittering the distribution of samples generated
on $A_i$ from each vertex.

A feature of this approach is that the geometry used to test visibility may be 
different than the geometry used for shading and rendering. Suppose that a scene 
contains some smooth curved surfaces that we have decided to tile into small flat 
polygons for the purposes of energy balancing. We can retain the original curved 
surface description of the scene and use it during the visibility tests, intersecting 
the ray against the original smooth surfaces. This means that curved surfaces may 
be subdivided to an arbitrary density to get good coverage by many small elements 
without increasing the cost of determining visibility. Figure 18.30 (color plate) shows
a quadratic surface that has been trimmed into a teapot shape with cubic splines. 
The quadratic surface was subdivided into 28 x 42 patches. The hemicube algorithm 
would probably cause severe aliasing in this picture; for example, by alternately 
finding and missing the small handle of the teapot. Ray-traced form factors were 
computed at each of the 6,086 vertices for ten steps of progressive radiosity, using 
five samples per source. 

The result of the algorithm on a much more complex database is shown in 
Figure 18.31 (color plate). Here two bays of the cathedral were modeled and 
energy-balanced, and then the pair was simply replicated three times to produce 
the complete nave of six bays. Because the original curved surfaces were retained 
throughout the process, the final rendering uses the correct surface normal due to 
the surface at each point, rather than a polygonal approximation. The original two 
bays contained 9,916 polygons and 74,806 vertices. The solution required 60 steps 
of progressive radiosity. Shooting patches were not subdivided; only one sample per 
patch was fired to determine visibility of sources from vertices. 

For this method to be efficient, it is essential that it be able to determine quickly
whether a ray intersects any objects on its way from the vertex to the shooting
patch; this requires efficient ray-object intersections on the model. There are many
algorithms available for accelerating this process; Arvo and Kirk [17] provide a
survey.


Line Densities 

All of the algorithms we have seen so far in this section have been demand-driven: 
when we want a particular form factor, we do the work to compute it. An alternative 
is to consider an algorithm that might be prohibitively expensive for a small number 
of form factors, but contains some common piece of work that is repeated for every 
calculation. Then that step can be moved into a preprocessing step, and then the 
form factor calculations themselves may prove to be efficient enough to compete 
with the methods above. If this can be accomplished, then whether or not it pays off 
depends on the costs of the various steps involved and the number of form factors 
we wish to compute. In general, the more form factors we need, the more attractive 
a preprocessing phase becomes. 

Such approaches have been considered by Buckalew and Fussell [68] and Sbert
[377]. They both generate dense collections of lines in the environment, and then
estimate the relationship between pairs of patches by the relative numbers of
intersections of those lines. Buckalew and Fussell generate families of parallel lines, while
Sbert distributes them randomly in space. To estimate the form factor between two
patches, one method offered by Sbert is to form the ratio

$$\frac{N(A_i, A_j)}{N(A_i)} \tag{18.78}$$

where $N(A_i, A_j)$ is the number of lines crossing both patches, and $N(A_i)$ is the
number of lines crossing patch $i$, whether or not they also cross patch $j$.
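A small experiment shows the ratio of Equation 18.78 at work. The sketch below is one possible way to realize randomly distributed lines, not necessarily the construction used in the cited work: each line joins two points chosen independently and uniformly on a sphere that encloses the scene, which yields a uniform density of lines inside the sphere. The scene is just two unoccluded, parallel unit squares one unit apart, and the function names, sphere radius, and line count are assumptions of the example.

```python
import numpy as np

def random_sphere_points(n, center, radius, rng):
    """Return n points uniformly distributed on the surface of a sphere."""
    g = rng.normal(size=(n, 3))
    g /= np.linalg.norm(g, axis=1, keepdims=True)
    return center + radius * g

def crosses_unit_square(p, q, z):
    """True where the line through p and q crosses [0,1] x [0,1] in the plane at height z."""
    with np.errstate(divide="ignore", invalid="ignore"):
        t = (z - p[:, 2]) / (q[:, 2] - p[:, 2])
        x = p[:, 0] + t * (q[:, 0] - p[:, 0])
        y = p[:, 1] + t * (q[:, 1] - p[:, 1])
        return (x >= 0.0) & (x <= 1.0) & (y >= 0.0) & (y <= 1.0)

def line_density_form_factor(n_lines=400_000, seed=0):
    """Estimate F between unit squares at z = 0 and z = 1 by counting shared lines."""
    rng = np.random.default_rng(seed)
    center = np.array([0.5, 0.5, 0.5])
    radius = 0.9                       # large enough to enclose both squares
    p = random_sphere_points(n_lines, center, radius, rng)
    q = random_sphere_points(n_lines, center, radius, rng)
    hit_i = crosses_unit_square(p, q, 0.0)
    hit_j = crosses_unit_square(p, q, 1.0)
    # N(A_i, A_j) / N(A_i): lines crossing both, over all lines crossing A_i.
    return np.count_nonzero(hit_i & hit_j) / np.count_nonzero(hit_i)

print(line_density_form_factor())
```

The estimate can be compared with the tabulated form factor for directly opposed unit squares one unit apart, which is close to 0.2. Note that the same set of lines could be reused for every pair of patches in a scene, which is the preprocessing advantage described above.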


18.5.5 Discussion 

All three basic form factors in Table 18.2 contain a term of $1/r^2$. When the distance $r$
goes to zero, this becomes a second-order pole in the form factor kernel (that is, a
squared term in the denominator), introducing a singularity into the computation.
One approach to handling this singularity is to multiply it with a second-order
zero in the numerator. This technique was used by Zatz [503], who switched the
weight function used in the inner product to a Jacobi polynomial that contained
the appropriate $r^2$ term in the numerator. As pointed out by Schroder [383], this
leads to increased work and storage. Schroder has recently investigated singularities
in form factor calculations for Galerkin methods in detail [383]. He notes that we
do have exact analytic solutions for abutting polygons for box basis functions, but
higher-order bases are more difficult to handle. He suggests that a good approach
is to switch the quadrature rule being used to carry out the numerical integration
to a different rule over a tiled domain, so that the new rule compensates for the
singularity.

This section has presented only a survey of some of the more common methods for 
form factor calculation. There are many more varieties and variations. Figure 18.32 
from Cohen and Wallace offers a taxonomy, and in their book they address each of 
these methods in detail [99]. 

There is a rich body of material on form factors in the heat transfer literature that 
has only recently been mined by the computer graphics community; surveys of this 
literature are identified in the Further Reading section. 


18.6 Hierarchical Radiosity

Radiosity programs spend most of their time performing one of two steps: computing 
form factors and solving the linear system. When using an algorithm like progressive 
radiosity, the same form factors will be calculated repeatedly every time the same 
patch is selected as the shooter. Each form factor comes at some cost, and the more 
form factors there are, the longer it takes to solve the resulting equations. If we 
could somehow cut down on the number of form factors required to propagate the 
light through the environment we should see significant savings. 
[Diagram labels: Form factor solutions; Hemisphere sampling; Area sampling; Hemicube;
Single plane; Monte Carlo; Contour; Monte Carlo; Uniform.]

FIGURE 18.32

A taxonomy of form factor methods. Reprinted, by permission, from Cohen and Wallace in
Radiosity and Realistic Image Synthesis, fig. 4.3, p. 71.


At first this may hardly seem possible; after all, the form factors describe the 
interaction of light between pairs of surfaces. How could we delete any of them and 
still hope to get an accurate solution? 

One way to avoid computing some form factors is to simply observe that in a 
complex environment many form factors are zero, because the patches cannot see 
each other. Geometric processing can help us avoid even consideration of pairs of 
polygons that are guaranteed not to interact with each other [169,178,435]. 

Another approach is more subtle, and draws on work performed to solve the
classic N-body problem in physics. Consider a system of $n$ independent massive
objects in space. Each one exerts gravitational force on all the others, so to figure
out where each one moves can require explicit evaluation of each of the $n(n-1)/2$
interactions. When we wish to simulate the evolution of a galaxy containing tens of
thousands of stars (or even more), the computational costs become prohibitive.

However, we can make a practical observation that dramatically reduces the 
cost of the problem: we usually do not need perfect accuracy. In other words, the 
precision of the gravitational field upon one particle due to another is usually limited 
by the computational hardware and software being used. Consider a test body in 
space that is separated by a great distance from a cluster of two other bodies that 
are relatively near each other, as in Figure 18.33. For any given level of desired 
precision, there is an associated distance where the magnitude and direction of the 
gravitational fields from the two bodies in the cluster are indistinguishable at the test 
particle. At that distance we can just replace the two fields by one that is twice as 
strong in the average direction. 
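
The distance argument above can be checked numerically. The following is a minimal sketch, not part of the original algorithm description: it assumes two equal point masses in 2D and replaces them with a single body of twice the mass at their midpoint. The names field and cluster_error, and the choice of units with G = 1, are assumptions made only for this illustration.

```python
import math

def field(mass, source, target, G=1.0):
    """Gravitational field vector at `target` due to a point mass at `source` (2D)."""
    dx, dy = source[0] - target[0], source[1] - target[1]
    r2 = dx * dx + dy * dy
    r = math.sqrt(r2)
    g = G * mass / r2
    return (g * dx / r, g * dy / r)

def cluster_error(distance, separation=1.0, mass=1.0):
    """Relative error from replacing two equal bodies by one aggregate body.

    The test particle sits at the origin; the two bodies sit `distance` away,
    `separation` apart. The aggregate has twice the mass, at their midpoint.
    """
    target = (0.0, 0.0)
    f1 = field(mass, (distance, -separation / 2), target)
    f2 = field(mass, (distance, +separation / 2), target)
    exact = (f1[0] + f2[0], f1[1] + f2[1])
    approx = field(2 * mass, (distance, 0.0), target)
    err = math.hypot(exact[0] - approx[0], exact[1] - approx[1])
    return err / math.hypot(*exact)

# The error shrinks as the cluster recedes from the test particle.
for d in (2.0, 10.0, 100.0):
    print(d, cluster_error(d))
```

For a given tolerance, we can read off the distance beyond which the pair may be treated as a single body.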

This idea of clustering interactions can be applied recursively to ever-larger clusters 
of bodies. The basic idea is that if the interaction between two bodies decreases 
with distance and size of the body (smaller objects exert less of a gravitational force 
than larger ones), then there will always come a distance where a pair of bodies 
may be considered a single body of larger size, and then this aggregate body may be 
combined with another body (perhaps itself an aggregate), and so on. It has been 
proven that this sort of approach can yield an algorithm with running time O(n) 
rather than O(n^2) [138]. 

It was noted by Hanrahan and Salzman [191] that the form factor problem has 
much in common with the N-body problem: both are concerned with the interactions 
between all pairs of objects, and both the gravitational force and the form factor are 
proportional to the size of the objects and inversely proportional to the square of 
the distance between them. The two problems are not identical, since the physics 
of gravity and light transfer are different, but there is enough similarity that we can 
apply the general ideas behind the clustering algorithms for the N-body problem to the form 
factor problem. The result is the hierarchical radiosity (or HR) algorithm [191,192]. 

The physical intuition for the hierarchical radiosity algorithm is that small details 
don’t matter when we are far away from something. This is the same observation that 
guided the development of multiple levels of detail in models and shading algorithms, 
as discussed in Section 15.10. Suppose that we are rendering an interior scene of a 
large office containing an overhead lamp, a desk, and chairs, and there are various 
objects scattered about on the desk. Suppose the desk is in one corner, and consider 
a patch of the wall near the ceiling in the opposite corner, as in Figure 18.34. The top 
of the desk and all the objects upon it are visible from the wall, but the illumination 
from the desk upon the wall patch will probably not change appreciably if we put a 
pencil on top of the desk. On the other hand, the illumination falling on the patch 
belonging to the table top directly under the pencil will be significantly affected when 
the pencil is added to the scene. 

From the point of view of the wall, as far as reflected illumination is concerned 
the entire desk can probably be considered a single big patch; from the point of view 
of the pencil, however, the specific distribution of illumination on the table matters 
quite a lot. We need to have a detailed description of the illumination available when 
it’s necessary, but not otherwise. 

FIGURE 18.33 

A test body A and a clump of two other bodies B1 and B2. (a) The fields at the test particle are 
distinguishable. (b) The fields at the test particle are identical to within a predefined tolerance. (c) A 
single force equal to the combination of the two in (b). 

FIGURE 18.34 

(a) A patch on the near wall sees a desk. (b) The desk can be considered one polygon. (c) A pencil 
on the desk top. (d) A close-up of the desktop; it needs to be finely subdivided to capture the pencil’s 
shadow. 

The result of using the right level of detail at different places in the scene is 
that we obtain computational savings: we can replace the many wall-desk form 
factor calculations and balancing operations with a single one. In fact, the wall 
can probably be considered a single big patch from the point of view of the desk as 
well, so we also can eliminate the form factors in the other direction. If there are m 
patches on the desk and n patches on the wall, then we can replace the mn energy 
interactions between desk and wall with a single one. The interactions between the 
m patches on the desk with themselves still need to be considered, as we have seen 
above, but we have managed to eliminate many form factors and energy-balancing 
calculations. 

At the heart of the hierarchical radiosity algorithm is the idea of a hierarchy 
(or tree) of subdivided patches. We begin with some collection of n big patches in 
our environment, say one for each wall, one for the top of the desk, and so on. 
Then we compute the form factor between each of these patches. This step requires 
n(n - 1)/2 interactions, which we said we wanted to avoid. The essential point is 
that these are big patches, often larger than we would dare use in standard radiosity 
calculations. For example, usually a wall will be subdivided into a mesh of smaller 
patches before we start a progressive refinement algorithm, in order to catch the 
variation of illumination over the wall surface (even when using higher-order basis 
functions, subdivision is needed to capture shadows and other local variations in 
radiosity). The hierarchical refinement algorithm starts with just a single patch (or a 
very coarse grid) for the wall, so although the algorithm starts by computing O(n^2) 
interactions, this value of n is much lower than for nonhierarchical algorithms. 

Each time a pair of patches is examined, the algorithm considers the error that 
would be introduced if the patches were used at that size. One way of estimating 
this error is to compute the form factor from one patch to the other and compare 
it against a threshold. If the form factor is large, the implication is that a lot of the 
energy radiated by the first patch is transferred to the second; if too 
much energy is transferred, then perhaps we would be mistaken in using a single 
form factor for the entire transfer. However, if the form factor is small, then much 
less of the energy radiated by the first patch is intercepted by the second patch, and 
it seems reasonable to use a single form factor to describe the transfer. 

For example, consider Figure 18.35(a). The patches share a common border, 
so they have a large form factor. When the two patches are subdivided 
(Figure 18.35(b)), we now have eight patches and sixteen form factors, whereas before 
we had only two of each. The patches farthest away from the common edge transfer 
relatively little energy back and forth, so they don’t need to be subdivided any 
further. But the patches along the common boundary continue to subdivide until 
they become smaller than a predefined limit. This is shown in Figure 18.35(c). Note 
that the patches aren’t simply subdivided without structure; we maintain a hierarchy 
identifying the parent of every patch created by subdivision. 

FIGURE 18.35 

Two patches undergoing refinement, and their associated hierarchies. (a) The patches before 
subdivision. (b) After one step of subdivision on both. (c) Refinement of the patches next to the 
shared edge. (d) After one more step of refinement along the edge. 

We can now give specific examples of the sorts of interactions we mentioned 
earlier. In Figure 18.35(d), patches A222 and B222 in the near corner need to interact 
with each other because they exchange a lot of energy. But A222 does not share 
much energy with B000 at the opposite corner. In fact A222 doesn’t really interact 
with much beyond the two smallest patches of B that are right next to it. 

Consider the two patches A2 and B0 at the second level of the hierarchy. Their 
form factor is probably pretty small. As far as these two patches are concerned, 
there’s no reason to subdivide; a single form factor from A2 to B0 is probably about 
as accurate as using the sixteen interactions from A2x to B0y for x, y ∈ {0, 1, 2, 3}. But 
we do need to subdivide A2 further because of its interaction with B2. The crucial 
observation is that just because we are subdividing A2 we don’t need to refine its 
interaction with B0; that single form factor is fine for that transfer. So when it’s time 
to shoot energy from the patches in that corner, we can do one transfer from A2 to 
B0, and then multiple transfers from A2x to B2xy where more precision is needed. 

We say that the hierarchy for a patch contains a root (the node at the top 
representing the original patch), internal nodes I within the tree, and leaves L at the 
bottom. The root may be considered an internal node since it has children. There 
are four types of interaction, illustrated in Figure 18.36: IL, LI, LL, and II. 

One could represent this structure with a form factor matrix that had an entry 
for the exchange between each pair of leaf nodes. Consider an LI transfer: one 
form factor represents the transfer of radiosity from a leaf to all of the leaves below 
the internal node in the hierarchy. That means that the form factor from the leaf to 
all of those other leaves would be the same: we would have constructed a constant 
block within the matrix. To see how this works, we can take a simple example in 
two dimensions (2D radiosity was popularized by Heckbert [202]). In Figure 18.37 
we show a pair of perpendicular line segments, each divided into four segments 
recursively, following the same structure as Figure 18.35 in 3D. 

The matrix of form factors corresponding to this fully subdivided pair of lines 
contains 64 entries, coupling each possible pair of leaves. The AB interactions 
occupy the upper-right 4 × 4 submatrix in Figure 18.37(b). 

We said above that grouping interactions resulted in blocks of constant value in 
the form factor matrix. To see this, consider that the hierarchy associated with each 
line has a root, two interior nodes, and four leaf nodes, for a total of seven nodes. 
This means that there are forty-nine possible types of interactions from A to B , and 
forty-nine from B to A. 

The forty-nine possible AB transfers are illustrated in Figure 18.38. The rows 
correspond to the size of the shooting patch, and the columns correspond to the size 
of the receiving patch. I have organized the rows and columns to correspond to the 
subdivision tree of the lines. Each colored region represents a block of constant form 
factors, which could otherwise be replaced by a single form factor representing the 
complete transfer between leaf and cluster or cluster and cluster. 

FIGURE 18.37 

(a) Two subdivided line surfaces at right angles. (b) The corresponding form factor matrix. 

In general, if there are 2^k leaves for some integer k, then there will be (2^k)^2 = 2^(2k) 
interactions of type LL within a matrix of (2^(k+1) - 1)^2 total interactions. The trick 
behind HR is that we can use just a few of these nodes; each time we select anything 
other than an LL node, we eliminate explicitly accounting for all the nodes below the 
selected ones. Following the same reasoning that led to the subdivision, Figure 18.39 
shows the matrix elements that would actually be required by the HR method for 
AB transfers (the BA transfers, in this case, would be similar). Notice the matrix of 
all leaves would have sixteen elements, but we have only needed seven. Even in this 
simple example, we have eliminated over half the form factors, and thus a significant 
amount of computation. 
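
The counts quoted above follow directly from the size of a binary subdivision tree. As a small check, the sketch below tabulates them for a line refined k times; the function name interaction_counts is an assumption made for this illustration.

```python
def interaction_counts(k):
    """Node and interaction counts for one line refined into 2**k leaves.

    Counts describe transfers from one hierarchy (say A) to another (say B)
    with the same structure.
    """
    leaves = 2 ** k
    nodes = 2 ** (k + 1) - 1          # root + internal nodes + leaves
    return {
        "leaves": leaves,
        "nodes": nodes,
        "leaf_leaf": leaves ** 2,     # LL interactions only
        "all_levels": nodes ** 2,     # every node paired with every node
    }

print(interaction_counts(2))
```

With k = 2 this gives four leaves, seven nodes, sixteen LL interactions, and forty-nine possible interactions, matching the two-line example.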

Although grouped interactions may be represented by constant blocks in a matrix 
of form factors, that would be an inefficient use of storage, and we would still end 
up computing with them to balance the energy in the scene. Instead of storing an 
explicit matrix, HR creates a list of links , each of which describes an interaction 
from one patch to another. 

The links are created in order along with the refinement, so that links are built 
as the subdivision proceeds. Figure 18.40 shows the upper-right corner of the AB 
hierarchical form factor matrix again, here coded by the level in the hierarchy at 
which each element is created. The story being told by this picture is that when 
we are looking at interactions of large clusters, we only need a few links. As the 
refinement proceeds we start creating more and more links to handle the fine-scale 
interaction of small patches. 

It is instructive to consider how many links will be made in general. Hanrahan et 
al. have suggested a counting argument that the number of links will be proportional 
to n, the number of input polygons [192]. 

Consider Figure 18.41, which shows a linear patch (that is, a line in 2D). We 
will assume that the subdivision threshold is set so that a patch can interact with 
another at the same level if they are not adjacent; otherwise both patches must be 
refined. This means that siblings (two descendants of the same patch) cannot interact 
because they share the midpoint of the parent patch, and first cousins cannot interact 
because they share the midpoint of the grandparent patch. In the figure, B3 and B4 
are siblings, and B4 and B5 are first cousins (in the figure we have used letters to 
designate the generation level, not different patches; e.g., all the C-level patches are 
parents of the B-level patches). Leaves must interact if they have not already 
done so. 

Given this structure, how many interactions will there be? At any level of refinement, 
we need only concern ourselves with links to patches that have not already 
been linked to by an ancestor. So any patch must connect to the children of its 
parent’s neighbors (the patches its parent couldn’t connect to), unless that link is 
forbidden. Consider patch B4. Its parent C2 will interact with C4, so B4 need not 
address any node below C4 in the tree. But C2 is prohibited from interacting with C1 
and C3, so their children are still awaiting interaction with the nodes represented by 
C2 (and thus descended from it in the tree). So B4 can establish links to B1 and B2, 
the children of C1, and to B6, a child of C3. It is prohibited from linking to B3 or B5 
because it is adjacent to them. There was nothing special about B4; all the internal 
nodes (except those on the edge) will go through exactly the same process. So each 
node connects to a constant number of other nodes, and thus the total number of 
links is proportional to the number of nodes. 

FIGURE 18.40 

The order of creating links during refinement. (a) The step number. (b) All the steps at once. 

FIGURE 18.41 

Determining links for a linear patch. 

The symmetry of the situation may be a bit easier to see if the nodes are arranged 
in a circle, as in Figure 18.42(a), so that there are no edge effects. Here we have 
indicated the groupings with internal lines; these are not meant to indicate new 
surfaces. The matrix corresponding to this situation is shown in Figure 18.42(b), 
where the blocks represent constant values. 
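
The counting argument can be tried out on a toy model. The sketch below is an illustration under simplifying assumptions, not the authors' code: a single line is refined into 2^levels leaves, two patches at the same level are linked if they are not adjacent, adjacent (or identical) patches are both refined, and adjacent leaves are linked because they cannot be refined further. The name count_links and the indexing scheme are assumptions.

```python
def count_links(levels):
    """Count directed links for a line refined into 2**levels leaves.

    Patches at a given depth are indexed 0 .. 2**depth - 1 from left to right,
    so patches i and j are adjacent when |i - j| == 1.
    """
    def refine(i, j, depth):
        if abs(i - j) > 1:
            return 1                      # not adjacent: one link at this level
        if depth == levels:
            return 0 if i == j else 1     # adjacent leaves must still interact
        # Same patch or neighbors: refine both and examine the child pairs.
        return sum(refine(2 * i + di, 2 * j + dj, depth + 1)
                   for di in (0, 1) for dj in (0, 1))
    return refine(0, 0, 0)

# Leaves, hierarchical links, and the all-pairs count n(n - 1) for comparison.
for levels in range(2, 11):
    n = 2 ** levels
    print(n, count_links(levels), n * (n - 1))
```

In this model the number of links stays below a constant multiple of the number of leaves, while the all-pairs count grows quadratically.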

FIGURE 18.42 

A refined circular patch. (a) The subdivision. Internal lines represent clustering, not new surfaces. 
(b) The resulting form factor matrix. Blocks indicate constant values. 

We can now specify the HR algorithm in a bit more detail. We will present the 
algorithm as a collection of pseudocode fragments, following the fine organization 
presented by Cohen and Wallace [99]. 

We will actually provide quite extensive pseudocode in this section. The purpose 
of the code is not to suggest actual programming details, but to offer an unambiguous 
description of the algorithm. Since the HR technique (and the variants we will also 
discuss) represents the current state of the art in this form of solution process, I feel it 
is as important to describe the mechanics of these algorithms as it was to describe the 
derivation of important equations in previous chapters. These code fragments will 
bear some resemblance to the structure of an actual system, but we will not address 
any of the critical implementation details that are essential to a working system; the 
references provide a wealth of information for the implementor. 

struct Node { 
    float          Bg;    gathering radiosity 
    float          Bs;    shooting radiosity 
    float          Yg;    gathering importance 
    float          Ys;    shooting importance 
    float          E;     emission 
    float          A;     patch area 
    float          ρ;     reflectivity 
    struct Node *  N;     pointer to list of children 
    struct Link *  L;     pointer to list of links 
} 

FIGURE 18.43 

The Node structure for hierarchical radiosity. 

There are two types of structures in the system: a Link and a Node. A node 
contains information about a node in the hierarchy, and a link represents a selected 
transfer of energy between nodes. These two structures are shown in Figures 18.43 
and 18.44. 

We will indicate an element of a structure with the dot notation; e.g., p.E for the 
emission field for a node p. 

Each Node structure contains a gathering radiosity Bg, which is the radiosity 
it has received but not yet sent into the environment, and a shooting radiosity Bs, 
which is the radiosity it presents to the world at any given moment. A Node also 
contains an emission term E, an area term A, and a reflectivity ρ. It contains a 
pointer N to a list of subnodes if this node is subdivided (initially N is set to a 
default such as NULL), and a pointer L to a list of links connecting this node to other 
nodes. The fields Ys and Yg are used to store importance; these will be discussed in 
Section 18.6.3. 

struct Link { 
    struct Node *  p;      pointer to shooting node 
    struct Node *  q;      pointer to gathering node 
    float          Fqp;    form factor 
    struct Link *  L;      pointer to next link 
} 

FIGURE 18.44 

The Link structure for hierarchical radiosity. 


A Link structure represents a transfer of energy from a shooting patch p to a 
gathering patch q; the receiving patch q is the patch with which the link is stored. 
The Link contains the form factor Fqp for this transfer, and a pointer L to other 
links. So to gather radiosity for a node n, we look at each link L and compute 
L.p.Bs × L.Fqp. 
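
For readers who prefer running code, the two structures might be rendered as follows. This is only a sketch: it uses Python lists where the pseudocode uses pointer-linked lists, the field name rho stands in for the reflectivity ρ, and the helper link and the numeric values are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass(eq=False)
class Node:
    E: float = 0.0      # emission
    A: float = 1.0      # patch area
    rho: float = 0.5    # reflectivity
    Bg: float = 0.0     # gathering radiosity
    Bs: float = 0.0     # shooting radiosity
    Yg: float = 0.0     # gathering importance (unused here)
    Ys: float = 0.0     # shooting importance (unused here)
    children: List["Node"] = field(default_factory=list)
    links: List["Link"] = field(default_factory=list)

@dataclass(eq=False)
class Link:
    p: Node             # shooting node
    q: Node             # gathering node
    Fqp: float          # form factor for this transfer

def link(p, q, Fqp):
    """Create a link from shooter p and store it with the gatherer q."""
    new_link = Link(p, q, Fqp)
    q.links.append(new_link)
    return new_link

# A lamp shooting toward a wall across a single link.
lamp = Node(E=1.0, Bs=1.0)
wall = Node(rho=0.5)
link(lamp, wall, 0.2)
incoming = sum(l.p.Bs * l.Fqp for l in wall.links)
print(incoming)
```

The link lives in the receiver's list, so a node can find everything that illuminates it without searching the scene.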

The calling dependence of the pseudocode routines is shown in Figure 18.45. We 
will give explicit listings for all the routines except those in parentheses. 


18.6.1 One Step of HR 

The general idea for using hierarchical refinement is that we start with the n large 
patches provided by the designer and run them pairwise into a refinement routine. 
That routine either builds a link between the two patches if that would be acceptable, 
or it subdivides one or the other and then calls itself to examine the new patches 
for possible linking or further refinement. When the links are established, we call 
a solution program to transfer the energy around on the links until the system is 
converged. 

The driver for the whole operation is called SolveSHR (SHR stands for simple 
hierarchical radiosity) and is listed in Figure 18.46. This is a very simple routine: it 
just initializes the system and then calls a routine to solve it. 

The first step in initialization is to assign the initial patch emittances to the shooting 
radiosities. We then pass through all the input patches and build the necessary links 
between them. Since each patch may be refined, we will end up associating a 
subdivision hierarchy with most of the patches. The root of this hierarchy is called 
the root node for that patch. 

To build a complete set of links, we need to check all pairs of nodes in both 
directions. If either node needs subdivision, then it’s subdivided and 
link-building is tried again. The routine InitBs listed in Figure 18.47 initializes the 
shooting radiosity, and BuildLinks in Figure 18.48 builds the links between each 
pair of polygons. 

FIGURE 18.45 

The calling dependence of HR pseudocode. 

SolveSHR() {              Solve simple hierarchical radiosity. 
    InitBs() 
    BuildLinks()          Create initial link structure. 
    SolveHR()             Call the system solver once. 
} 

FIGURE 18.46 

Pseudocode for SolveSHR. 

After initializing the shooting radiosity, the only initialization job left is to call 
Refine with pairs of nodes. Figure 18.49 gives the pseudocode for Refine. The 
first thing that Refine does is to see if the two nodes it is given can be linked right 
away. To determine this, it calls an auxiliary function, OKtoLinkNodes, listed in 
Figure 18.50. 

InitBs() {                    Initialize shooting radiosity. 
    for all nodes n 
        n.Bs ← n.E            Set initial shooting radiosity to emission. 
    endfor 
} 

FIGURE 18.47 

Pseudocode for InitBs. 

BuildLinks() {                Build initial set of links. 
    for all nodes a 
        for all nodes b 
            Refine(a, b)      Build links between each pair of nodes. 
        endfor 
    endfor 
} 

FIGURE 18.48 

Pseudocode for BuildLinks. 


Hanrahan et al. [192] call OKtoLinkNodes an oracle function, because to properly 
do its job, it needs access to more information than we have. The job of this 
function is to determine if linking these two patches would cause significantly more 
error than subdividing them and building links between the smaller patches. This is 
very important, because this single decision controls which links get built. 

We would like the oracle to decide on the need for subdivision without going 
through the expense of actually subdividing the patches. Therefore it is based on a 
couple of simple heuristics (we will see some more advanced forms of this function 
later on). As shown in the pseudocode, we allow the link to occur if the two patches 
are physically smaller than some threshold, or if an estimated form factor is below 
some threshold. We can use any computationally convenient means for estimating 
the form factor; Hanrahan et al. used a solid-angle approximation similar to the 
approaches in Section 18.5.3. 
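
As an illustration of such an oracle, here is a minimal sketch. The form factor estimate is a point-to-disk approximation chosen for convenience; it is an assumption of this example and not necessarily the estimate used by Hanrahan et al. The function names, the argument list, and the threshold values eps_area and eps_f are likewise hypothetical.

```python
import math

def estimate_form_factor(distance, cos_a, cos_b, area_b):
    """Rough form factor from patch a to patch b (assumed estimate).

    Treats b as a disk of area `area_b` seen from the center of a, with
    `cos_a` and `cos_b` the cosines between each normal and the joining line.
    """
    return cos_a * cos_b * area_b / (math.pi * distance ** 2 + area_b)

def ok_to_link(area_a, area_b, distance, cos_a=1.0, cos_b=1.0,
               eps_area=0.01, eps_f=0.05):
    """Return True if the two patches may be linked at their current size."""
    if area_a < eps_area and area_b < eps_area:
        return True     # both patches are small enough to be okay
    # Otherwise link only if the estimated form factor is small enough.
    return estimate_form_factor(distance, cos_a, cos_b, area_b) < eps_f

print(ok_to_link(1.0, 1.0, 10.0))   # distant patches
print(ok_to_link(1.0, 1.0, 0.5))    # nearby patches
```

Neither test requires subdividing anything, which is the point: the oracle stays cheap even when it is called many times.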

Returning to Refine, if the nodes can be linked, then we call the routine Link 
to establish the connection. A call of Link (a, b) adds a link node to the list at 
b, indicating that it receives energy from a. This is all that needs to be done, and 
Refine returns. 

Refine(a, b) {                              Establish links between nodes a and b. 
    if OKtoLinkNodes(a, b) then 
        Link(a, b)                          Linking these nodes is fine. 
    else 
        node ← ChooseAndDivide(a, b)        Pick a node to subdivide. 
        if node = a then 
            for each child r of a           Check links for descendants of a. 
                Refine(r, b) 
            endfor 
        else if node = b then 
            for each child r of b           Check links for descendants of b. 
                Refine(a, r) 
            endfor 
        else 
            Link(a, b)                      Nodes are not subdividable after all. 
        endif 
    endif 
} 

FIGURE 18.49 

Pseudocode for Refine. 

OKtoLinkNodes(a, b) {                       Is it okay to link these two nodes? 
    if a.A < εA and b.A < εA 
        return True                         They're small enough to be okay. 
    endif 
    if EstimateFormFactor(a, b) < εF 
        return True                         The form factor is small enough. 
    endif 
    return False                            Subdivide one and try again. 
} 

FIGURE 18.50 

Pseudocode for OKtoLinkNodes. 

SolveHR() {                                 Balance energy using HR links. 
    while not converged 
        for every root node r 
            GatherRad(r)                    Gather up energy from all incoming links. 
        endfor 
        for every root node r 
            PushPullRad(r, 0)               Give energy to children; collect their energy back. 
        endfor 
    endwhile 
} 

FIGURE 18.51 

Pseudocode for SolveHR. 


But if the nodes cannot be immediately linked, then it must be because the oracle 
determined that it would create too much error to link them at this level, and one or 
the other needs to be subdivided. We call a routine called ChooseAndDivide that 
examines the two nodes and chooses one of them to be subdivided. Now, because 
each root node is tested against all other nodes, the node selected 
by ChooseAndDivide may have already been subdivided; if it hasn’t, the routine 
creates the four subdivided children before returning. It is also possible that 
ChooseAndDivide may determine that neither of the nodes can be advantageously 
subdivided; in this case it returns a value that does not point to either a or b. 

When Refine resumes after this call, it looks to see which node has been selected 
and subdivided by ChooseAndDivide. Refine then calls itself recursively 
to establish links between the unaffected node and the children of the subdivided 
node. If neither node was selected for subdivision, they are simply linked together, 
overruling the oracle. 
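
The refinement logic just described can be exercised on a toy scene. The sketch below makes several assumptions that are not in the text: patches are axis-aligned squares in two parallel planes facing each other along z, the form factor estimate is a simple point-to-disk approximation, the node chosen for subdivision is always the larger one, and the thresholds EPS_AREA, EPS_F, and MIN_AREA are arbitrary. The class Patch and all function names are hypothetical.

```python
import math

EPS_AREA = 0.001   # oracle: patches this small may always be linked
EPS_F = 0.02       # oracle: estimated form factors this small may be linked
MIN_AREA = 0.01    # patches smaller than this are never subdivided

class Patch:
    def __init__(self, x, y, z, size):
        self.x, self.y, self.z, self.size = x, y, z, size
        self.area = size * size
        self.children = []   # filled in on first subdivision
        self.links = []      # (shooter, form factor) pairs gathered here

    def subdivide(self):
        """Create the four children once; later calls reuse them."""
        if not self.children:
            q = self.size / 4
            self.children = [Patch(self.x + sx * q, self.y + sy * q, self.z,
                                   self.size / 2)
                             for sx in (-1, 1) for sy in (-1, 1)]

def estimate_ff(a, b):
    dx, dy, dz = b.x - a.x, b.y - a.y, b.z - a.z
    d2 = dx * dx + dy * dy + dz * dz
    cos2 = dz * dz / d2                  # both normals lie along z
    return cos2 * b.area / (math.pi * d2 + b.area)

def ok_to_link(a, b):
    if a.area < EPS_AREA and b.area < EPS_AREA:
        return True
    return estimate_ff(a, b) < EPS_F

def choose_and_divide(a, b):
    big = a if a.area >= b.area else b
    if big.area < MIN_AREA:
        return None                      # neither node can usefully be divided
    big.subdivide()
    return big

def refine(a, b):
    """Link shooter a to gatherer b, refining as needed; returns links made."""
    if ok_to_link(a, b):
        b.links.append((a, estimate_ff(a, b)))
        return 1
    which = choose_and_divide(a, b)
    if which is a:
        return sum(refine(r, b) for r in a.children)
    if which is b:
        return sum(refine(a, r) for r in b.children)
    b.links.append((a, estimate_ff(a, b)))   # overrule the oracle
    return 1

near = refine(Patch(0, 0, 0, 1.0), Patch(0, 0, 1, 1.0))
far = refine(Patch(0, 0, 0, 1.0), Patch(0, 0, 20, 1.0))
print(near, far)
```

Two distant squares are joined by a single link, while two nearby squares are refined into many smaller transfers.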

When the links are finally established, Refine returns, and the next pair of 
nodes are linked. When all pairs have been linked up, control returns to SolveSHR, 
which calls SolveHR to actually solve for the radiosity. In a traditional radiosity 
system this is where a matrix would be inverted. The routine SolveHR is given in 
Figure 18.51. 

There’s not much to SolveHR. It visits every root node and instructs the patch 
(and hierarchy) associated with that node to gather energy from the other patches in 
the environment through the routine GatherRad. Now comes the tricky part where 
we need to make sure that the radiosity gathered at different levels of the hierarchy 
is correctly distributed. We manage this process with the routine PushPullRad, 
which is applied to each root node. SolveHR runs through this loop again and again 
until the energy distribution converges, just like every other radiosity algorithm. Let’s 
look at the two routines involved in this process, starting with GatherRad, listed 
in Figure 18.52. 

GatherRad(n) {                                      Get radiosity into this node. 
    n.Bg ← 0                                        No radiosity gathered yet. 
    for each link L into n 
        n.Bg ← n.Bg + n.ρ × [L.Fqp × L.p.Bs]        Accumulate some radiosity to shoot. 
    endfor 
    for each child r of n 
        GatherRad(r)                                Accumulate for each child. 
    endfor 
} 

FIGURE 18.52 

Pseudocode for GatherRad. 

GatherRad visits each link that transfers energy into the given node and gathers 
energy from the node at the other end of the link. Since the shooter is node p, the 
radiosity absorbed and re-radiated at n over link L is simply n.ρ × [L.Fqp × L.p.Bs], 
since n is the same as q for this link. If n has any children, they need to gather their 
energy too, so we call GatherRad recursively. Remember that these child nodes 
are coincident with n, though they are smaller. They represent energy transfers that 
were too important in some way (as determined by the oracle) to approximate with 
just a single big transfer to the parent node. 
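
A direct transcription of the gathering step might look like the sketch below. It is only an illustration: links are stored as (shooter, form factor) pairs rather than as Link structures, rho stands in for ρ, and the scene values are made up.

```python
class Node:
    def __init__(self, rho=0.5, Bs=0.0):
        self.rho = rho        # reflectivity
        self.Bs = Bs          # shooting radiosity
        self.Bg = 0.0         # gathering radiosity
        self.children = []
        self.links = []       # (shooter, Fqp) pairs into this node

def gather_rad(n):
    """Gather radiosity over every link into n, then into its descendants."""
    n.Bg = 0.0                                # no radiosity gathered yet
    for shooter, Fqp in n.links:
        n.Bg += n.rho * (Fqp * shooter.Bs)    # accumulate radiosity to shoot
    for child in n.children:
        gather_rad(child)

# A wall gathers from a lamp at the root; one child has its own finer link.
lamp = Node(Bs=10.0)
wall = Node(rho=0.5)
corner = Node(rho=0.5)
wall.children.append(corner)
wall.links.append((lamp, 0.2))
corner.links.append((lamp, 0.4))
gather_rad(wall)
print(wall.Bg, corner.Bg)
```

After this call each node holds only what arrived over its own links; nothing has yet been shared between parent and child.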

When all the energy has been gathered, we need to distribute the light gathered at 
different levels throughout the tree at each node before we can start gathering again. 
This process is accomplished by the routine PushPullRad, listed in Figure 18.53. 

The heart of PushPullRad is how it sends radiosity down the hierarchy (the 
push part) and how it combines the radiosity coming up the hierarchy from a node’s 
children (the pull part). Let’s look at the pull part first. 

Recall that radiosity is power per unit area: 

$$ B = \frac{\Phi}{A} \tag{18.79} $$

If we take n coplanar, abutting patches, each of which has power $\Phi_c$ and area $A_c$, 
then the total radiosity of the aggregate is the total power divided by the total area: 

$$ B = \frac{\sum_{c=1}^{n} \Phi_c}{\sum_{c=1}^{n} A_c} = \frac{\sum_{c=1}^{n} B_c A_c}{A} \tag{18.80} $$

where A is the area of the parent. This last expression is just what is computed inside 
PushPullRad. It tells us that the radiosity due to the children of a node is simply 
the radiosity of each child weighted by the relative area of the child. To find the total 
radiosity at a given level, we only need to find this contribution from below, and add 
in the contribution from this level. 

PushPullRad(n, Bdown) {                             Node n inherits radiosity Bdown. 
    if n is a leaf then 
        Bup ← n.E + n.Bg + Bdown                    Send up my emission, reflection, and inheritance. 
    else 
        Bup ← 0                                     Nothing collected yet. 
        for each child r of n                       Get child's radiosity. 
            Bup ← Bup + (r.A/n.A) ×                 Add in child's radiosity scaled by its 
                  PushPullRad(r, n.Bg + Bdown)      relative area. 
        endfor 
    endif 
    n.Bs ← Bup                                      I'm passing this up; I want to shoot it. 
    return (Bup)                                    And pass my radiosity back up. 
} 

FIGURE 18.53 

Pseudocode for PushPullRad. 

The push part is much simpler. Since radiosity is power per unit area, and we 
assume power output is constant across the node, the radiosity of any subpatch is the 
same as the radiosity of the parent patch, since the ratio of power to area remains 
constant. If the child has an area that is a fraction α of the parent’s area A, then 

$$ B_c = \frac{\alpha \Phi}{\alpha A} = \frac{\Phi}{A} = B \tag{18.81} $$


Now we can look at PushPullRad as a simple distributor of energy. We start 
at the top of the hierarchy, and look up the gathered radiosity B g at that level. That 
radiosity is inherited by each child, so we recursively call PushPullRad passing 
down this radiosity. If these children are internal nodes, then they add their gathered 
radiosity to what they inherited and pass the total on downward. Finally we reach a 
leaf node; it has inherited the sum of the radiosity from every level above it. The 
leaf adds in its own gathered radiosity plus its emission, and that becomes the new 
shooting radiosity for the leaf. It stores this locally and then sends the result back 
up. The parent node now does nothing but combine the shooting radiosities of each 
of its children (weighted by relative area); the result is that node’s own shooting 
radiosity. It saves it locally and sends it back up the tree, and so on until we reach 
the root. 

At this point the hierarchy for each root node makes sense: each internal node 
contains the area-weighted radiosities of its children, and each leaf node contains the 
total radiosity gathered by the entire path of the tree above it. Localized transfers to 
intermediate nodes stay localized to that node and its descendants, but are included 
in the averages computed by its ancestors. 

Now that all the trees are balanced, control returns to SolveHR, which checks for 
convergence and calls GatherRad and PushPullRad as many times as necessary 
until an equilibrium solution has been found. 
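
To see the whole solution loop in action, the following sketch puts gathering, push-pull, and the convergence test together. It is a minimal illustration under stated assumptions, not a working radiosity system: links are (shooter, form factor) pairs, emission is assumed to be stored at the leaves, convergence is declared when no shooting radiosity changes by more than tol, and the two-patch scene and its numbers are invented for the example.

```python
class Node:
    def __init__(self, A, E=0.0, rho=0.0, children=()):
        self.A, self.E, self.rho = A, E, rho
        self.Bg = 0.0                 # gathering radiosity
        self.Bs = E                   # shooting radiosity starts as emission
        self.children = list(children)
        self.links = []               # (shooter, Fqp) pairs into this node

def gather_rad(n):
    n.Bg = sum(n.rho * Fqp * p.Bs for p, Fqp in n.links)
    for r in n.children:
        gather_rad(r)

def push_pull_rad(n, B_down=0.0):
    if not n.children:
        B_up = n.E + n.Bg + B_down    # leaf: emission, reflection, inheritance
    else:
        # Push this level's gathered radiosity down; pull children's back up,
        # weighted by relative area as in Equation 18.80.
        B_up = sum((r.A / n.A) * push_pull_rad(r, n.Bg + B_down)
                   for r in n.children)
    n.Bs = B_up
    return B_up

def all_nodes(n):
    yield n
    for r in n.children:
        yield from all_nodes(r)

def solve_hr(roots, tol=1e-9, max_iters=1000):
    """Iterate gather and push-pull until shooting radiosities settle."""
    for iteration in range(max_iters):
        old = [m.Bs for r in roots for m in all_nodes(r)]
        for r in roots:
            gather_rad(r)
        for r in roots:
            push_pull_rad(r)
        new = [m.Bs for r in roots for m in all_nodes(r)]
        if max(abs(x - y) for x, y in zip(old, new)) < tol:
            return iteration + 1
    return max_iters

# Two unsubdivided patches exchanging light over one link in each direction.
a = Node(A=1.0, E=1.0, rho=0.5)
b = Node(A=1.0, rho=0.8)
a.links.append((b, 0.3))
b.links.append((a, 0.3))
iterations = solve_hr([a, b])
print(iterations, a.Bs, b.Bs)
```

For this scene the answer can be checked by hand: the two balance equations are Ba = 1 + 0.15 Bb and Bb = 0.24 Ba, so Ba = 1/(1 - 0.036).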

Figure 18.54 (color plate) shows three images of an office scene at different levels 
of refinement. The size of each patch is indicated by its image outlined in white. 

A summary overview of hierarchical radiosity is shown in Figure 18.55. At the 
top we come in with two patches and a proposed link from one to the other. First we 
test each patch to see if it is smaller than a size threshold, and we estimate the form 
factor to see if it is below threshold. If the patches and the form factor are small 
enough, then we exit the loop and create the link. Otherwise we subdivide the larger 
patch, create possible links from each subpatch to the smaller input patch, and run 
each of these four new pairs of patches through the same process. 
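The refinement loop summarized above can be sketched as a short recursive routine. Everything named here is hypothetical: `Patch` is a toy axis-aligned square, and `estimate_ff` is a crude distance-based stand-in for a real form factor estimate that ignores orientation and visibility. The sketch also assumes that a link is accepted when either the form factor estimate is below threshold or both patches are already below the size threshold, which guarantees that the recursion terminates.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Patch:
    cx: float
    cy: float
    cz: float
    size: float  # side length of an axis-aligned square patch

    @property
    def area(self) -> float:
        return self.size * self.size

    def subdivide(self) -> List["Patch"]:
        half, quarter = self.size / 2.0, self.size / 4.0
        return [Patch(self.cx + dx, self.cy + dy, self.cz, half)
                for dx in (-quarter, quarter) for dy in (-quarter, quarter)]


def estimate_ff(p: Patch, q: Patch) -> float:
    """Toy estimate of the form factor from p to q (no cosines, no occlusion)."""
    d2 = (p.cx - q.cx) ** 2 + (p.cy - q.cy) ** 2 + (p.cz - q.cz) ** 2
    return q.area / (math.pi * d2 + q.area)


def refine(p: Patch, q: Patch, f_eps: float, a_eps: float,
           links: List[Tuple[Patch, Patch]]) -> None:
    """Link p and q, or subdivide the larger patch and try again."""
    small = p.area < a_eps and q.area < a_eps
    if small or max(estimate_ff(p, q), estimate_ff(q, p)) < f_eps:
        links.append((p, q))
        return
    if p.area >= q.area:
        for child in p.subdivide():
            refine(child, q, f_eps, a_eps, links)
    else:
        for child in q.subdivide():
            refine(p, child, f_eps, a_eps, links)
```

Because each subdivision partitions one side of a proposed link, the products of the linked areas always sum to the product of the two original areas; this is a handy sanity check on any implementation.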


18.6.2 Adaptive HR 

The routine SolveSHR in Figure 18.46 builds a hierarchy and then solves the resulting
radiosity relationships. Recall that the oracle function, called OKtoLinkNodes,
compared two nodes and decided whether or not to link them. We said that this
function controlled the structure of the hierarchy because it told us when to
subdivide a patch and when we could build a link. The function only had access to
the geometry of the nodes, so it made a decision based on patch sizes and form
factors. Hanrahan et al. [192] called this F refinement in reference to the form factor
component.

FIGURE 18.55

An overview of hierarchical radiosity.

But once the system has been solved from this particular set of links, we have 
learned something important about the environment that we didn’t know before: an 
estimate for the distribution of energy. What we would really like is for each link in 
the system to carry the same amount of energy. Once the system has been solved for 
a particular set of links, we can pass through the structure and make changes, adding 
links where we were too conservative the first time. When all links carry the same 
amount of energy, there’s no advantage to either shooting or gathering, and nothing 
to be gained by shooting from the brightest source as in progressive radiosity: every 
link has just as much effect as every other. 

As long as we’re moving through the structure again, we can also use a smaller 
threshold in the form factor test; nodes that were not subdivided before will have 
to be subdivided in order to get small enough to satisfy this new threshold. This 
process is an example of a general technique called multigridding. The idea behind
multigridding is that we can first compute a coarse approximation to the solution 
using a rough grid, and then slowly refine the grid into smaller and smaller elements. 
This method will often converge more quickly than starting with the smallest-size 
cells in the first place, because the coarse solution gets us close to the correct answer 
at low cost; when the grid is refined, it is usually pretty close to the right answer 
already. 

This combination of geometric and illumination information to control the hierarchy
is called BF refinement. Implementation of this method requires only two new
routines and a replacement for the main control. The main program, now called
SolveAHR (for adaptive hierarchical refinement), is listed in Figure 18.56.

The routine SolveAHR is very similar to SolveSHR, except that after initialization
and one pass through the solver, it runs through all the links in the system
looking for any that can be refined. If any links are changed, then the system is
re-solved with the new configuration, and the links are scanned again. The process
repeats until the system has reached an equilibrium with respect to both the
distribution of light and the power carried by the links.

The refinement test for links is called RefineLink and is listed in Figure 18.57.
As with the patch-based test Refine, the routine RefineLink calls an oracle to
determine if a link needs to be refined. The routine OKtoKeepLink provides the
BF-refinement version of the oracle, and is listed in Figure 18.58.

SolveAHR() {                                Solve adaptive hierarchical radiosity.
    InitBg()
    BuildLinks()                            Initialize and create initial links.
    done ← False
    while done = False
        done ← True
        SolveHR()                           Solve current system.
        for all links L
            if RefineLink(L) = True
                done ← False                If any link was refined, re-solve later.
            endif
        endfor
    endwhile
}

FIGURE 18.56

Pseudocode for SolveAHR.

The oracle OKtoKeepLink lets a link remain unrefined if any one of three conditions
holds: the patches involved are small, the shooter has no power, or not enough
radiosity reaches the gatherer. This last test is the key to BF refinement. As we
mentioned earlier, the thresholds for these tests may be large at the start of the
process, and then gradually reduced to drive the system toward a more accurate
solution. A result of applying BF refinement is shown in Figure 18.59 (color plate).
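As a concrete illustration, the BF oracle can be written as a small Python predicate. This is only a sketch: the `Node` and `Link` classes, their field names, and the threshold parameters `eps_a` and `eps_bf` are assumptions made for this example. Note that a shooter with no power passes the second test automatically, since its product with the form factor is zero.

```python
from dataclasses import dataclass


@dataclass
class Node:
    area: float      # patch area A
    b_shoot: float   # current shooting radiosity B_s


@dataclass
class Link:
    p: Node          # shooter
    q: Node          # gatherer
    f_qp: float      # form factor from q to p, stored on the link


def ok_to_keep_link(link: Link, eps_a: float, eps_bf: float) -> bool:
    """Return True if the link is good enough to leave unrefined."""
    if link.p.area < eps_a and link.q.area < eps_a:
        return True   # both patches are already small
    if link.p.b_shoot * link.f_qp < eps_bf:
        return True   # too little radiosity crosses this link to matter
    return False
```

Lowering `eps_bf` between solution passes causes more links to fail the test and be refined, which is the multigridding schedule described earlier.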

Observe that the hierarchical radiosity algorithm is inherently a multiresolution 
technique: at any given time, different parts of the algorithm are dealing with the 
same surfaces in differently sized pieces. As we saw in Chapter 6, wavelets are a 
natural means for discussing multiresolution phenomena. Gortler et al. [167] have 
shown how to apply wavelet bases to the hierarchical radiosity algorithm, creating 
a whole family of different higher-order radiosity algorithms based on different 
wavelet basis sets. This works well because the form factor matrix in radiosity 
problems (which still exists, though implicitly, in hierarchical methods) is mostly 
smooth. Wavelets are able to describe this matrix by capturing the large smooth 
regions with large, smooth functions, and then capturing fast local changes with a 
few additional localized bases. 


18.6.3 Importance HR

RefineLink(L) {                             If link needs refining, do so and return True.
    if OKtoKeepLink(L)
        return False                        No refinement needed.
    endif
    node ← ChooseAndDivide(L.p, L.q)        Pick a node.
    if node = L.p
        for each child r of L.p
            Link(r, L.q)                    Build new links to L.q.
        endfor
    else
        for each child r of L.q
            Link(L.p, r)                    Build new links to L.p.
        endfor
    endif
    DeleteLink(L)                           Get rid of old link.
    return True
}

FIGURE 18.57

Pseudocode for RefineLink.

OKtoKeepLink(L) {                           Is it okay to accept this node?
    if L.p.A < ε_A and L.q.A < ε_A
        return True                         The patches are small enough to be okay.
    endif
    if L.p.B_s × L.F_qp < ε_BF
        return True                         There’s not enough energy to be transferred.
    endif
    return False                            This link needs refinement.
}

FIGURE 18.58

Pseudocode for OKtoKeepLink.

Recall the idea of importance from our discussion of integral equations in
Section 16.9.3. We saw that if we had an importance function defined on the same
domain as our unknown function, we could use it to guide our process of solving
for the unknown.

This idea has been applied to hierarchical radiosity by Smits et al. to create an 
importance-driven hierarchical radiosity algorithm [414]. The algorithm is quite 
easy to implement and can dramatically improve the efficiency of the solution. 

Recall the fundamental identity from Equation 16.178 that relates an unknown
function $x$ and its driving function $g$ to the unknown importance $w$ and its driving
function $p$:

$$\langle x \mid p \rangle = \langle g \mid w \rangle \tag{18.82}$$

The importance w and the solution x are unknown in this function, while the given 
potential p and the driving term g are both known. The product of one known 
and unknown matches the product of the other known and unknown. Thus we 
sometimes say that if we knew the importance we would know the solution; the 
problems are closely related. 

In terms of radiosity, the unknown $x$ is the radiosity $B$, and the driving function
is the emittance $E$. We define $R$ to be the driving function for importance, and $Y$ to
be the importance. Then we can restate our relation above as

$$\langle B \mid R \rangle = \langle E \mid Y \rangle \tag{18.83}$$

(the notation varies: Smits et al. used $\langle \Phi \mid R \rangle = \langle S \mid \Psi \rangle$ [414], and Cohen and
Wallace used $\langle B \mid R \rangle = \langle S \mid T \rangle$ [99]). Expanding out this braket into traditional
radiosity-style sums over discrete elements gives us the related pair of equations:

$$B_i = E_i + \rho_i \sum_{k=1}^{n} B_k F_{i,k}$$

$$Y_i = R_i + \sum_{k=1}^{n} \rho_k Y_k F_{k,i} \tag{18.84}$$

Note the switch in the indices in the two right-hand sides.

We can also write these expressions in matrix notation. For a form factor matrix
$K$, we find

$$E = KB$$

$$R = K^T Y \tag{18.85}$$

The need for the transpose is expected because the importance and the radiosity are
adjoint terms, and the adjoint of a real matrix is its transpose. Normally, we are
given the emittances $E$, and we try to find an approximate solution $\tilde{B}$ to $B$ using
an approximate form factor matrix $\tilde{K} = K + \Delta K$. Then the approximate solution
satisfies

$$E = \tilde{K}\tilde{B} = (K + \Delta K)\tilde{B} = K\tilde{B} + \Delta K\tilde{B} \tag{18.86}$$

which can be rearranged as

$$E - \Delta K\tilde{B} = K\tilde{B} \tag{18.87}$$

The first part of Equation 18.86 tells us that we can match the exact emittances
using an approximation composed of an approximate transport operator $\tilde{K}$ and an
approximate solution $\tilde{B}$. On the other hand, Equation 18.87 says that we can match
the approximate emittances $E - \Delta K\tilde{B}$ by using the approximate solution $\tilde{B}$ and the
exact transport operator $K$.

Our goal in all radiosity algorithms so far has been to minimize the error in our
solution; that is, we have tried to make $B - \tilde{B}$ as small as possible. But this is
not always necessary. Consider an interior office scene, composed of a room with 
bookcases, tables, chairs, and so on. If we’re standing in the doorway, then we 
probably cannot see the back of the desk, the insides of the drawers, and many other 
surfaces. As far as computing an image is concerned, if we can’t see these surfaces, 
then we really don’t care what their radiosity is. Of course, the effect of their radiated 
energy onto the surfaces we can see must be present. For example, for the scene just 
described, the back panel of the desk might be adequately represented by just a single 
polygon that absorbs and reflects the same energy as a finely subdivided back panel. 
The difference is that we can’t see the back, so we don’t care if the approximation 
is visually acceptable as long as its light propagation is accurate. Going one step 
further, the interaction of light inside the closed desk drawers is completely irrelevant 
to us. In F-refinement HR we would compute form factors for all the patches inside 
the drawers; even in BF-refinement, if a little light was leaking into the drawers, 
then we might even end up processing it. We would like to focus our attention on 
getting a good estimate of the radiosity that we can actually see in the environment, 
and not bother with overwhelming detail in invisible parts of the environment. 

We can focus our attention (and computing resources) on the visible parts of the 
scene by defining an image function v(B); think of this function as telling us what 
linear combination of radiosities must be used to find the radiance at a particular 
pixel. The importance of each radiosity value to the pixel value is exactly the driving 
importance term R. This observation allows us to easily derive Equation 18.83 in
terms of radiosity and importance [414]:

$$\begin{aligned}
v(B) &= R^T B \\
     &= Y^T K B \\
     &= Y^T (KB) \\
     &= Y^T E
\end{aligned} \tag{18.88}$$

The error we want to minimize is not simply $B - \tilde{B}$, which tries to get a good
approximation for the radiosity everywhere in the environment, but rather the visible
part of that error: $v(B - \tilde{B})$. In other words, it’s okay if there are errors in the
approximation where we can’t see it; as long as the cumulative impact of the invisible
part of the environment is accurate, it doesn’t matter how the radiosity is distributed.
Expanding this error and using the identities derived above, we find

$$\begin{aligned}
v(B - \tilde{B}) &= R^T B - R^T \tilde{B} \\
                 &= Y^T E - (Y^T K)\tilde{B} \\
                 &= Y^T E - Y^T (E - \Delta K\tilde{B}) \\
                 &= Y^T \Delta K \tilde{B}
\end{aligned} \tag{18.89}$$

So $Y^T \Delta K \tilde{B}$ is the error in our image due to using the approximations $\tilde{K}$ and $\tilde{B}$.
This is what we want to minimize: the error in the important radiosity, not just the
error in the radiosity.
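Both the adjoint identity of Equation 18.83 and the image error of Equation 18.89 can be checked numerically on a tiny system. The sketch below is illustrative only: the three-patch reflectivities, form factors, and the perturbed form factors are made-up numbers, and the simple fixed-point solver is an assumption chosen to keep the example free of external libraries. Here the matrix $K$ is taken to be $I - M$ with $M_{ik} = \rho_i F_{i,k}$, so that $KB = E$ matches Equation 18.84.

```python
def fixed_point(src, m, iters=300):
    """Solve x = src + m x by iteration (converges when m has norm < 1)."""
    n = len(src)
    x = list(src)
    for _ in range(iters):
        x = [src[i] + sum(m[i][k] * x[k] for k in range(n)) for i in range(n)]
    return x


def dot(a, b):
    return sum(u * v for u, v in zip(a, b))


n = 3
rho = [0.5, 0.3, 0.8]                                     # made-up reflectivities
F = [[0.0, 0.4, 0.3], [0.2, 0.0, 0.5], [0.3, 0.4, 0.0]]   # made-up form factors
F_approx = [[0.0, 0.35, 0.3], [0.25, 0.0, 0.5], [0.3, 0.4, 0.0]]
E = [1.0, 0.0, 0.0]   # patch 0 emits light
R = [0.0, 0.0, 1.0]   # patch 2 is directly important to the image

M = [[rho[i] * F[i][k] for k in range(n)] for i in range(n)]
M_T = [[M[k][i] for k in range(n)] for i in range(n)]
M_approx = [[rho[i] * F_approx[i][k] for k in range(n)] for i in range(n)]

B = fixed_point(E, M)                # radiosity:  B = E + M B
Y = fixed_point(R, M_T)              # importance: Y = R + M^T Y
B_approx = fixed_point(E, M_approx)  # radiosity from the approximate operator

# K = I - M, so Delta K = K_approx - K = M - M_approx.
dK = [[M[i][k] - M_approx[i][k] for k in range(n)] for i in range(n)]

image_error = dot(R, B) - dot(R, B_approx)                       # v(B - B~)
predicted = sum(Y[i] * sum(dK[i][k] * B_approx[k] for k in range(n))
                for i in range(n))                               # Y^T dK B~
```

Up to rounding, `dot(R, B)` equals `dot(E, Y)`, and `image_error` equals `predicted`, which is the content of the two equations.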

These ideas are demonstrated in Figure 18.60 (color plate). Here a maze is
illuminated by a number of light sources, and the solution is shown in red in
Figure 18.60(a). If an image is rendered from a point of view near the bottom, as
indicated by the small eye in Figure 18.60(b), then we can solve for the importance
from that view, shown in green. Notice that for this viewpoint there are a
lot of unimportant patches in the model. If we superimpose the two solutions as
in Figure 18.60(c), we see in yellow the patches that are both important and emit
significant radiosity. Those are the patches we care the most about for the given
viewpoint.

The problem here is that we don’t know $Y$; in fact, it is just as hard to find as $B$
itself. So our algorithm will find an estimate $\tilde{B}$ for $B$, and at the same time compute
an estimate $\tilde{Y}$ for $Y$. Then we can compute the error

$$\tilde{Y}^T \Delta K \tilde{B} \tag{18.90}$$

and use that to refine the solution, improving our guess of both functions at once.

To implement this approach we need to determine how to distribute importance
up and down the hierarchy, just as we needed to distribute radiosity after each
gathering step. The essential observation comes from the form factors. Suppose
that patch $P_i$ is a child of patch $P_j$, and there is another patch $P_k$ far away, as in
Figure 18.61. Then we can observe that $P_k$ will capture about the same fraction of
radiated energy from both $P_i$ and $P_j$:

$$F_{i,k} \approx F_{j,k} \tag{18.91}$$

Similarly, consider power leaving $P_k$ for $P_j$. The amount of this power caught by $P_i$
is the relative area of the smaller patch to the larger:

$$F_{k,i} \approx \frac{A_i}{A_j} F_{k,j} \tag{18.92}$$

FIGURE 18.61

Parent and child patches j and i viewed from another patch k.

Armed with these approximations we can evaluate the transfer of radiosity and
importance downward in the hierarchy from $B_j$ to $B_i$, and from $Y_j$ to $Y_i$, using
Equation 18.84. Assuming that the assigned importance of $P_i$ is also based on the
relative area it occupies on its parent (that is, $R_j (A_i / A_j)$), then we find

$$B_i \approx B_j$$

$$Y_i \approx \frac{A_i}{A_j} Y_j \tag{18.93}$$


This tells us how a subpatch inherits radiosity and importance. The result for
radiosity is that the child simply inherits the value from its parent, which is the
property we used in PushPullRad. On the other hand, importance is area-weighted, so a
child has an importance relative to its parent given by the ratio of their areas.

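Working now from child to parent, we can observe from the definition of the
form factor and reciprocity the following relationships, where the sums run over the
children $P_i$ of $P_j$:

$$F_{j,k} = \sum_{i} \frac{A_i}{A_j} F_{i,k}, \qquad F_{k,j} = \sum_{i} F_{k,i} \tag{18.94}$$

Plugging these into Equation 18.84 gives us

$$B_j = \sum_{i} \frac{A_i}{A_j} B_i, \qquad Y_j = \sum_{i} Y_i \tag{18.95}$$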
SolveImpHR() {                              Solve an importance-driven HR problem.
    InitBs()                                Initialize gathered radiosity.
    for all root nodes r
        r.Y_s ← r.R                         Set initial shooting importance.
    endfor
    BuildLinks()                            Build initial links.
    ε_BFI ← ε_BFI,0                         Start with a large error value.
    while ε_BFI > ε_t                       Solve until error is small.
        SolveDual()                         Estimate a new solution.
        for all links L
            RefineImpLink(L)                See if any links can be improved.
        endfor
        ε_BFI ← ε_BFI − Δε_BFI              Reduce the permissible error.
    endwhile
}

FIGURE 18.62

Pseudocode for SolveImpHR.


Again we find that radiosities are area-averaged as we work our way up (also used 
in PushPullRad), and that importances are simply summed. 

In other words, radiosities and importances are propagated up and down the tree 
in exactly opposite ways, demonstrating again their adjoint relationship. 
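The adjoint behavior is easy to see in code. The following Python sketch of the importance pass is an illustration under stated assumptions: the `Node` class and its field names (`r` for assigned importance, `y_gather`, `y_shoot`) are hypothetical. Compare it with the radiosity pass, where values are inherited unchanged on the way down and area-averaged on the way up.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    area: float
    r: float = 0.0          # assigned (driving) importance R
    y_gather: float = 0.0   # importance received over links in the last pass (Y_g)
    y_shoot: float = 0.0    # importance available for shooting (Y_s)
    children: List["Node"] = field(default_factory=list)


def push_pull_imp(node: Node, y_down: float = 0.0) -> float:
    """Push importance down by area weighting, pull plain sums back up."""
    if not node.children:
        y_up = node.r + node.y_gather + y_down
    else:
        y_up = 0.0
        for child in node.children:
            # Push: the child receives its area share of what this node holds.
            y_child = (node.y_gather + y_down) * child.area / node.area
            # Pull: importances returned by the children are simply summed.
            y_up += push_pull_imp(child, y_child)
    node.y_shoot = y_up
    return y_up
```

Because the downward shares sum to one and the upward pass is a plain sum, the value stored at the root equals the total importance assigned and gathered anywhere in its tree.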

Now that we know how to distribute importance after a gathering step, we can
create a new oracle that includes the importance of a link in its decision. The
result is called BFI refinement. The only change required to the data structures is the
addition of an importance-shooting element $Y_s$ and an importance-gathering element
$Y_g$ to the Node data structure. A single link still suffices to relate two patches, but
as we can see from Equation 18.84, radiosity and importance travel over the link in
opposite directions.

To get the process rolling, we start with a new driving function SolveImpHR,
listed in Figure 18.62.

SolveImpHR initializes not only the element radiosities as in HR, but also the
shooting importance from the assigned importance. This brings up the question of
what a good value for the assigned importance might be. One reasonable suggestion
is to use the magnitude of the solid angle of the visible part of the patch projected
onto the viewing surface, as in Figure 18.63. This solid angle can be determined by
any rendering method; effectively, it’s the form factor from the patch to the pixel.

FIGURE 18.63

Assigning importance based on projected solid angle.

Once initial radiosities and importances have been assigned, SolveImpHR sets
the acceptable error threshold to a large value and calls SolveDual, which solves
for both the radiosity and the importance as carried by the current links. We then
call RefineImpLink, which uses a new oracle OKtoKeepImpLink that implements
BFI refinement. The new oracle is listed in Figure 18.64; it’s basically the old
oracle with the inclusion of Equation 18.90.

To form estimates for both radiosity and importance, SolveImpHR calls
SolveDual, listed in Figure 18.65.

The operation of the dual solver is similar to that of the basic HR solver. We
first pass through all the nodes and gather radiosity, but we also shoot importance
across the link at the same time, using GatherRadShootImp. Then we balance
the radiosities and the importances; for convenience we have left PushPullRad
alone and added an importance resolver PushPullImp. Let’s look at these in turn.
GatherRadShootImp is listed in Figure 18.66.

GatherRadShootImp is just about the same as GatherRad, except that we
also shoot importance over the same link that we’re using to gather radiosity.

Finally, PushPullImp is listed in Figure 18.67. This routine distributes importance
downward by area-weighting, and simply sums importance coming back up,
implementing Equations 18.93 and 18.95.
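The combined gather-and-shoot step can be sketched as follows. This is a minimal illustration, not the listing in Figure 18.66: the `Node` and `Link` classes and their field names are assumptions, and the sketch assumes each link stores the form factor from the gathering node to the source node, which Equation 18.84 suggests is the factor needed for both transfers.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    rho: float                 # reflectivity
    b_shoot: float = 0.0       # B_s
    b_gather: float = 0.0      # B_g
    y_shoot: float = 0.0       # Y_s
    y_gather: float = 0.0      # Y_g
    links: List["Link"] = field(default_factory=list)   # links into this node
    children: List["Node"] = field(default_factory=list)


@dataclass
class Link:
    p: Node        # node at the far end of the link (source of radiosity)
    f_qp: float    # form factor from the gathering node to p


def gather_rad_shoot_imp(n: Node) -> None:
    """Gather radiosity into n and shoot n's importance back along each link."""
    n.b_gather = 0.0
    for link in n.links:
        # Radiosity flows from p to n ...
        n.b_gather += n.rho * link.f_qp * link.p.b_shoot
        # ... and importance flows the other way, from n to p.
        link.p.y_gather += n.rho * link.f_qp * n.y_shoot
    for child in n.children:
        gather_rad_shoot_imp(child)
```

The caller is assumed to clear `y_gather` before each pass, as SolveDual does, since importance arrives at a node from links stored on other nodes.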
OKtoKeepImpLink(L) {                        Is it okay to accept this node?
    if L.p.A < ε_A and L.q.A < ε_A
        return True                         The patches are small enough to be okay.
    endif
    if L.p.B_s × L.F_qp × L.q.Y_s < ε_BFI
        return True                         There’s not enough energy to be transferred.
    endif
    return False                            This link needs refinement.
}

FIGURE 18.64

Pseudocode for OKtoKeepImpLink.


SolveDual() {                               Solve for radiosity and importance.
    while not converged                     Repeat until equilibrium.
        for every root node r
            r.Y_g ← 0                       No importance gathered yet in this pass.
        endfor
        for every root node r
            GatherRadShootImp(r)            Gather and shoot over all links.
        endfor
        for every root node r
            PushPullRad(r, 0)               Share up and down trees.
            PushPullImp(r, 0)
        endfor
    endwhile
}

FIGURE 18.65

Pseudocode for SolveDual.
GatherRadShootImp(n) {                      Gather radiosity and shoot importance.
    n.B_g ← 0                               No radiosity gathered yet.
    for each link L into n
        n.B_g ← n.B_g + n.ρ × (L.F_qp × L.p.B_s)        Gather radiosity from p.
        L.p.Y_g ← L.p.Y_g + n.ρ × (L.F_qp × n.Y_s)      Shoot importance to p.
    endfor
    for each child r of n
        GatherRadShootImp(r)                Process my children.
    endfor
}

FIGURE 18.66

Pseudocode for GatherRadShootImp.



PushPullImp(n, Y_down) {                    Node n inherits importance Y_down.
    if n is a leaf then
        Y_up ← n.R + n.Y_g + Y_down         Send up my importance, reflection, and
                                            inheritance.
    else
        Y_up ← 0                            Nothing collected yet.
        for each child r of n
            Y_t ← (n.Y_g + Y_down) × r.A / n.A          Get child’s importance.
            Y_up ← Y_up + PushPullImp(r, Y_t)           Add it in scaled by that child’s relative area.
        endfor
    endif
    n.Y_s ← Y_up                            Save what I’m passing up.
    return (Y_up)
}

FIGURE 18.67

Pseudocode for PushPullImp.
The power of importance-driven refinement is demonstrated by Figure 18.68
(color plate), which is a model of a maze sitting on a table inside a larger maze. In 
Figure 18.68(a) we see the meshing constructed for the maze from a point somewhat 
above and behind it, and in Figure 18.68(b) we see a smoothly reconstructed version 
of the same image. It is informative to view the radiosity and importance solutions 
from different points, so the remaining images in this figure move the camera back 
and away from the maze, but they always show the same solution that was generated 
for Figure 18.68(a). In Figure 18.68(c) and (d) we see the radiosity and importance 
solutions for the maze from farther away. Note that the meshing where the walls 
join the floor is much denser in the region where the importance is high. Similarly, 
the quality of the mesh on the table is very good near the front, where it occupies 
much of Figure 18.68(a), but the table becomes very coarsely meshed where it is 
not visible. In Figure 18.68(e) and (f) we see the radiosity and importance solutions 
for the maze from even farther back, so we can also see the larger maze in which 
it sits. In the radiosity solution huge, brightly lit walls in the near part of the maze 
are completely unrefined, because they are unimportant to the image in (a). Notice 
that the complex block sculpture in the front-left also causes no refinement. The 
wall facing us just right of center is slightly refined because some of the illumination 
falling on it eventually makes its way to the maze. The level of subdivision indicates 
that not much light from the wall contributes to the image in (a). The importance 
solution in (f) shows us why the block sculpture in the near-left of (e) is unrefined: 
from the point of view that generates (a), the sculpture is irrelevant. 


18.6.4 Discussion 

The hierarchical radiosity algorithm and importance-driven refinement are important 
practical tools for solving energy transport problems in image synthesis. Our goal 
in this section has been to demonstrate the basic ideas and show how they may be 
linked to improve the efficiency of finding an equilibrium solution. 

Our example for determining importance was based on the direct contribution of 
a patch to the final image. Although useful, this is only one way to assign importance. 
We can attach importance to any feature of the model where it is important to have 
accurate sampling: on the surface of small but aesthetically important objects, on 
objects that are visible only through reflection, or objects that are completely invisible 
but contribute significant illumination to surfaces that are visible. 


18.7 Meshing

All of the methods we have seen above break down the environment into small 
patches in order to compute a radiosity solution. This subdivision of the environment
is called meshing. The problem of meshing is not unique to radiosity: the entire
engineering discipline of finite element analysis is intimately concerned with different 
meshing algorithms. 

The quality of a mesh directly affects the quality of the final radiosity solution. 
Cohen and Wallace offer an excellent analysis of the errors that occur when a 
scene is not properly meshed. Unfortunately, the definition of “proper” depends on 
many factors, including not only the scene description itself but particulars of the 
computing hardware on which the simulation is being run. Visual artifacts caused 
by insufficient meshing include blocky shadows, Mach bands, and missing features 
[98]. 

Surfaces that are too large for the illumination signal falling on them (say a
shadow edge in the middle of a patch) must be subdivided. This subdivision may be
regular, such as the subdivision of a rectangle into four smaller equally sized
rectangles. Some researchers have noted that Delaunay and Voronoi diagrams produce
a very uniform mesh, and may be used instead of regular subdivision [387,426].
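Regular subdivision driven by the illumination signal can be sketched in a few lines of Python. The function below is a hypothetical illustration: `illum` stands in for whatever estimate of the illumination is available, and comparing only the four corner samples is an assumption made for brevity (a feature that falls entirely inside a patch would be missed).

```python
def adaptive_mesh(x, y, w, h, illum, tol, max_depth, depth=0):
    """Split a rectangle into four until its corner samples agree within tol.

    Returns a list of leaf rectangles as (x, y, width, height) tuples.
    """
    corners = [illum(x, y), illum(x + w, y), illum(x, y + h), illum(x + w, y + h)]
    if depth >= max_depth or max(corners) - min(corners) <= tol:
        return [(x, y, w, h)]
    hw, hh = w / 2.0, h / 2.0
    leaves = []
    for cx, cy in ((x, y), (x + hw, y), (x, y + hh), (x + hw, y + hh)):
        leaves += adaptive_mesh(cx, cy, hw, hh, illum, tol, max_depth, depth + 1)
    return leaves
```

For a sharp shadow edge, such as `illum = lambda x, y: 1.0 if x < 0.3 else 0.0` on the unit square, the small rectangles cluster along the edge while the uniformly lit and uniformly dark regions stay coarse.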

An elegant solution to many meshing problems is the technique of discontinuity 
meshing (or DM) [202,275]. The idea is that if there are easily visible shading 
features in the environment, then the mesh should adapt to those features. Usually 
such features are shadows, highlights, and other local phenomena on surfaces that 
represent discontinuities of the radiosity distribution or one of its derivatives.
Discontinuity meshing attempts to place the boundaries between mesh elements right at
those locations. Figure 18.69 (color plate) shows the basic idea for a small pyramid 
illuminated by a pair of lights. In the upper row we see the scene from the side and 
above, and the discontinuities are marked in colored lines. The second row shows 
the progress of discontinuity meshing the base plane, while the lower row shows the 
progress of regular adaptive subdivision based on quadrilaterals. Note how much 
more closely the discontinuity mesh matches the features in the radiosity function. 

An example of the difference meshing can make is shown in Figure 18.70 (color 
plate), which was computed using the algorithm by Lischinski et al. [275]. The 
figure shows the shadow cast by a window on a wall. Note how the meshing in the 
standard solution interferes with the pattern of the shadows. 

A related set of images is shown in Figure 18.71 (color plate). The picture
on the right was computed using hierarchical radiosity; the one on the left with 
discontinuity meshing. Note how much more crisp the shadows have become, 
including the fine detail on the table top and under the near chair. Also, note the 
much stronger presence of a discontinuity along the top of the window and door in 
the HR solution. 

Rather than compute discontinuities implicitly from the geometry of the scene, 
we can try to construct isolux contours on the scene surfaces: like a topographical 
map, these contours indicate a curve of constant radiance on a surface. If those 
contours can be found analytically, then it may be easier to find discontinuities. A
method for determining the analytic distribution of light in some situations has been
described by Drettakis and Fiume [125,126].

Because most radiosity programs are based on meshing, the results must be 
smoothed before display to avoid a blocky-looking image. Typically the patch 
radiosities are averaged at the vertices, and then the vertex radiosities are interpolated 
using a method like Gouraud shading, which does a simple linear interpolation across 
pairs of radiances during display. Effectively this is a process of reconstruction of a 
continuous-time signal from a set of samples, and Gouraud shading is one form of 
linear interpolation. 
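The averaging and interpolation steps can be illustrated in one dimension, for a strip of equally sized patches. This Python sketch is an assumption-laden simplification of what a renderer does over a two-dimensional mesh; the function names are made up for the example.

```python
def vertex_radiosities(patch_b):
    """Average the radiosities of the patches that share each vertex of a strip."""
    n = len(patch_b)
    verts = []
    for v in range(n + 1):
        # Vertex v touches patches v-1 and v (only one of them at the ends).
        neighbors = patch_b[max(v - 1, 0):min(v + 1, n)]
        verts.append(sum(neighbors) / len(neighbors))
    return verts


def interpolate(verts, t):
    """Linearly interpolate vertex values; t is measured in patch widths."""
    i = min(int(t), len(verts) - 2)
    frac = t - i
    return (1.0 - frac) * verts[i] + frac * verts[i + 1]
```

For patch radiosities `[1.0, 3.0, 3.0]` the vertex values are `[1.0, 2.0, 3.0, 3.0]`, so the step between the first two patches is spread into a ramp across them.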

We know from Unit II that we can reconstruct a function much better with
a sinc function (or even a Gaussian bump) than with linear interpolation, which
corresponds to a tent function. A visible drawback of Gouraud interpolation is that 
we end up smoothing where we don’t want to smooth. Suppose we had two adjacent 
patches, one in bright light and one in dark shadow, sharing an edge generated by 
discontinuity meshing to follow the edge of a sharp shadow. We certainly don’t want 
to blend colors across this edge. A method for building smooth reconstructions where 
we want smooth signals, but which also supports abrupt discontinuities, has been 
offered by Salesin et al. [372]. Hermite interpolation for radiosity has been discussed 
by Bastos et al. [31]. 


18.8 Shooting Power

An alternative method to solving a matrix equation (even an implicit one) is to 
directly simulate the transfer of light throughout a scene. In the terms of progressive 
refinement, we pick a shooting patch and send out rays from that patch into the 
environment. 

Shirley has suggested that the algorithm is simplest if each ray carries power $\Phi$
rather than radiance $L$ [398] (recall Equation 18.25).

Such a method requires choosing a number of rays to shoot and a pattern in 
which to shoot them. Shirley has shown that if we distribute the rays uniformly, 
then to get the variance of the radiance estimate below some threshold requires only 
O(N) rays, where N is the number of patches [397]. 

To show this, we first digress for a moment to summarize some probability that
will prove useful (for more information on these terms, see Appendix B). Suppose
that we have a set $S$ of $N$ independent, identically distributed random variables $X_i$,
such that each $X_i$ has a value $x$ with probability $p$, and is 0 otherwise. Then the
expected value $E[S]$ of the sum over the set is given by

$$E[S] = E\left[\sum_{i=1}^{N} X_i\right] = N E[X_i] = Npx \tag{18.96}$$

The variance $\mathrm{var}(S)$ of the set is given by

$$\mathrm{var}(S) = \mathrm{var}\left(\sum_{i=1}^{N} X_i\right) = N\,\mathrm{var}(X_i) \tag{18.97}$$

The individual variances $\mathrm{var}(X_i)$ may be found by direct computation:

$$\mathrm{var}(X_i) = E[X_i^2] - E[X_i]^2 = px^2 - p^2x^2 = px^2(1 - p) \le px^2 \tag{18.98}$$
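These three results are easy to confirm by simulation. The Python sketch below is only a sanity check under assumed parameter values; the function names and the choice of $N = 50$, $p = 0.2$, $x = 3$ are made up for the example.

```python
import random


def sample_sum(n, p, x, rng):
    """Draw one value of S: each of n variables is x with probability p, else 0."""
    return sum(x for _ in range(n) if rng.random() < p)


def estimate_moments(n, p, x, trials, seed=1):
    """Estimate the mean and variance of S from repeated independent draws."""
    rng = random.Random(seed)
    sums = [sample_sum(n, p, x, rng) for _ in range(trials)]
    mean = sum(sums) / trials
    var = sum((s - mean) ** 2 for s in sums) / (trials - 1)
    return mean, var


mean, var = estimate_moments(n=50, p=0.2, x=3.0, trials=20000)
```

With these values Equation 18.96 predicts a mean of $Npx = 30$, Equations 18.97 and 18.98 predict a variance of $Npx^2(1 - p) = 72$, and the looser bound $Npx^2$ is 90; the estimates should land close to the first two and below the third.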


Now we can proceed with our radiance argument. We would like to find out
how many rays we need to fire into the environment in order for the variance on
patch $i$, given by $\mathrm{var}(L_i)$, to be lower than some threshold $V_0$.

We start by observing that the power reflected from patch $i$ is simply the
reflectivity of the patch $R_i$ times its total incident power. Suppose that we fire a
total of $r$ rays into the environment. We will assume that each ray carries the same
amount of power; if the total power to be shot is $\Phi_t$, then each ray carries $\Phi_t / r$.
The power $\Phi_i^k$ carried by ray number $k$ to patch $i$ will either be $\Phi_t / r$ if the ray
makes it to patch $i$, or else zero. The reflected power is then

$$\Phi_i = R_i \sum_{k=1}^{r} \Phi_i^k \tag{18.99}$$

We will assume that the probability of a ray striking patch $i$ is $p_i$. Because all the rays
are generated with the same distribution, this probability is constant for each ray.
Then using Equation 18.96, we have $N = r$ rays, a probability $p = p_i$ of intersection,
and a value $x = \Phi_t / r$ delivered to the patch upon intersection, giving an expected
power $E[\Phi_i]$ of

$$E[\Phi_i] = R_i\, r\, p_i \frac{\Phi_t}{r} = R_i p_i \Phi_t \tag{18.100}$$

and an expected radiance $E[L_i]$ of

$$E[L_i] = \frac{E[\Phi_i]}{\pi A_i} = \frac{R_i p_i \Phi_t}{\pi A_i} \tag{18.101}$$

Similarly, we can find the variance in the power and the radiance from
Equations 18.97 and 18.98:

$$\mathrm{var}(\Phi_i) \le R_i^2\, r\, p_i \left(\frac{\Phi_t}{r}\right)^2 = \frac{R_i^2 p_i \Phi_t^2}{r}, \qquad
\mathrm{var}(L_i) \le \left(\frac{R_i}{\pi A_i}\right)^2 \frac{p_i \Phi_t^2}{r} \tag{18.102}$$









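Our goal is to bring $\mathrm{var}(L_i) < V_0$, so we might be tempted to use as large an $r$
as possible; that is, we can drive down the variance by firing a lot of rays. This is
reasonable, but not the result we would like. We could also drive the area down, but
the bound on $\mathrm{var}(L_i)$ in Equation 18.102 blows up as the area $A_i$ gets smaller. Instead,
suppose that all areas are bounded by a range $[A_{\min}, A_{\max}]$. Writing $A$ as the total
area in the scene, $A = \sum_{i=1}^{N} A_i$, then the average area $A_a$ is given by $A_a = A/N$.
For some constant $K$,

$$\frac{A_i}{A_k} < K \tag{18.103}$$

for all choices of $i$ and $k$. Equivalently, $A_i < K A_a = KA/N$, and by the same
argument with the roles of the two patches exchanged, $A_i > A_a / K = A/(KN)$.

Suppose that the probability of a ray striking a surface with area $A_i$ is given by
$p_u < 1$ (this is violated only if the area completely encloses the origin of the ray, since
in that situation $p_u = 1$; if the surface is convex, then strict inequality holds). Then
using this as our probability for $E[L_i]$ in Equation 18.101,

$$\frac{R_i p_u \Phi_t}{\pi A_i} \le L_{\max} \tag{18.104}$$

for some maximum radiance $L_{\max}$. Then solving for this probability,

$$p_u \le \frac{\pi A_i}{R_i \Phi_t} L_{\max} \tag{18.105}$$

Substituting this bound into Equation 18.102 gives

$$\mathrm{var}(L_i) \le \left(\frac{R_i}{\pi A_i}\right)^2 \frac{p_u \Phi_t^2}{r}
\le \left(\frac{R_i}{\pi A_i}\right)^2 \frac{\Phi_t^2}{r} \cdot \frac{\pi A_i}{R_i \Phi_t} L_{\max}
= \frac{R_i \Phi_t}{\pi A_i r} L_{\max} \tag{18.106}$$

Assuming that the reflectivities $R_i$ satisfy $0 < R_i \le R_{\max} = 1$, and recalling
$A_i > A/(KN)$, we can write the variance as

$$\mathrm{var}(L_i) \le \frac{R_{\max} \Phi_t L_{\max}}{\pi \left(A/(KN)\right) r}
= \frac{K R_{\max} \Phi_t L_{\max}}{\pi A} \cdot \frac{N}{r} \tag{18.107}$$

or equivalently,

$$\mathrm{var}(L_i) \le C \frac{N}{r} \tag{18.108}$$

for a constant $C$ defined by

$$C = \frac{K R_{\max} \Phi_t L_{\max}}{\pi A} \tag{18.109}$$

To set $\mathrm{var}(L_i) \le V_0$, we have $CN/r \le V_0$, or

$$r \ge \frac{CN}{V_0} \tag{18.110}$$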




This is the result we sought; it says that to get the variance in the radiance below
some threshold $V_0$, the number of rays we need grows only linearly with the scene
complexity: it is the product of a constant $C/V_0$ with $N$, the number of patches.
Note that the constant is inversely proportional to the desired variance, so as we
are willing to tolerate more error we need fewer rays. Shirley has also shown that a
similar analysis can be carried out for further levels of interreflection [397]. Kok
has noted that groups of patches may be clustered for the purposes of shooting,
lowering the required number of power rays even more [249].
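
The bound of Equation 18.110 can be turned into a quick estimate of the ray budget. The sketch below is only illustrative: the function name `rays_needed` and the sample values of $C$, $N$, and $V_0$ are assumptions made for the example, not values from the analysis above.

```python
import math

def rays_needed(c: float, n_patches: int, v0: float) -> int:
    """Smallest integer r satisfying r > C * N / V0 (Equation 18.110)."""
    if v0 <= 0.0:
        raise ValueError("the variance threshold V0 must be positive")
    return math.floor(c * n_patches / v0) + 1

# Hypothetical constant C = 2.0 and variance threshold V0 = 0.5.
# Doubling the number of patches roughly doubles the ray count.
for n in (100, 200, 400):
    print(n, rays_needed(2.0, n, 0.5))
```

Loosening the threshold (a larger $V_0$) lowers the count in inverse proportion, which matches the observation that tolerating more error requires fewer rays.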

So shooting power directly from the patches into the environment is an alternative
to the matrix formulation that has reasonable computational requirements.


18.9 Extensions to Classical Radiosity

The hierarchical and importance-driven radiosity solutions described above are both 
based on improving the efficiency of the classical radiosity model. At least two of the 
assumptions behind this model can be relaxed: the limitation to diffuse reflectors, 
and the limitation to nonparticipating media. 

The classical radiosity method may be extended to nondiffuse surfaces by defining 
other types of form factors. For example, suppose that we have a scene of a room 
that contains a single flat mirror on a wall. From a point within the room, looking 
into the mirror is like looking through a window into another, identical room, as 
shown in Figure 18.72. 

Suppose we have two surfaces, $P_i$ and $P_k$, in the room and we want to compute
the form factor from $P_i$ to $P_k$. In the figure we show two ways light can travel this
path: directly from one surface to the next, and via a specular reflection off of the
mirror. By constructing an image of the room on the opposite side of the mirror, we
can account for this second transfer directly using traditional form factor algorithms,
and we can avoid any direct consideration of the patch represented by the mirror
[370].
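
Building the image room amounts to reflecting every patch across the plane of the mirror. The following is a minimal sketch of that step, assuming patches are stored as lists of 3D vertex tuples; the helper names `reflect_point` and `image_patch` are hypothetical and chosen only for this illustration.

```python
def reflect_point(point, plane_point, plane_normal):
    """Mirror `point` across the plane through `plane_point` with normal `plane_normal`."""
    nx, ny, nz = plane_normal
    length = (nx * nx + ny * ny + nz * nz) ** 0.5
    nx, ny, nz = nx / length, ny / length, nz / length
    # Signed distance from the point to the mirror plane.
    d = ((point[0] - plane_point[0]) * nx
         + (point[1] - plane_point[1]) * ny
         + (point[2] - plane_point[2]) * nz)
    return (point[0] - 2 * d * nx, point[1] - 2 * d * ny, point[2] - 2 * d * nz)

def image_patch(vertices, plane_point, plane_normal):
    """Return the mirror image of a polygonal patch.

    Reflection flips orientation, so the vertex order is reversed to make the
    winding-derived normal agree with the mirrored surface normal.
    """
    return [reflect_point(v, plane_point, plane_normal) for v in reversed(vertices)]

# A mirror on the wall x = 4, with its normal facing back into the room.
patch = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
print(image_patch(patch, (4.0, 0.0, 0.0), (-1.0, 0.0, 0.0)))
```

The form factor that includes the specular bounce is then an ordinary form factor to the image patch, with visibility restricted to paths passing through the mirror's outline.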

Min-Zhi Shao et al. noticed that this approach to specular surfaces, besides its 
limitation to flat patches, has an exponential growth with respect to the number of 
mirrors in the scene [392]. They suggested instead constructing form factors between 
two patches, which can include light specularly reflected from one to the other.
Ping-Ping Shao et al. suggested a different form of specular form factor, called
multipoint form factors or extended form factors, representing the three-point
transport of energy from one patch to another by way of an intermediate patch [393].

The basic idea is that we can write a form factor such as $F_{P_k, P_l, P_m}$, which relates
the energy transferred from $P_k$ to $P_m$ via reflection at $P_l$. Since the BDF at $P_l$ is
known, we can use the known relative geometry of $P_k$ and $P_m$ with respect to $P_l$
to determine how much energy would be propagated by such a reflection using a
general BDF at $P_l$.

FIGURE 18.72

(a) A room with a mirror on the wall. (b) An equivalent scene with an image of the room and a
hole instead of a mirror.

Another way of tracking specular effects is to discretize the outgoing radiance 
distribution from a point [224]. Rather than simply use a single radiosity value to 
represent the energy radiated equally in all directions, we can place a global cube 
around a point. Like a hemicube, the global cube faces are subdivided into grids. 
The light exiting each grid cell may be stored to represent a nonuniform distribution. 
This approach requires massive amounts of memory to save all the cubes in the 
scene. It also suffers from aliasing artifacts due to the regular spacing of cells on the 
cube faces.

A nonuniform propagation function may also be efficiently stored using spherical
harmonics [408]. This has the advantage that no a priori discretization is required, 
and memory is conserved. It also allows the propagation function to be evaluated 
from a continuous description at any point, rather than interpolated from stored 
grid elements. 

The classical radiosity formulation assumes that the medium between surfaces 
is a vacuum. If this requirement is relaxed, then we move into the realm of zonal 
methods, developed in the heat transfer literature. Zonal methods were introduced
to graphics by Rushmeier and Torrance [369]. The basic idea is that in addition to
the surface-to-surface form factors we have concentrated on in this chapter, we also
develop surface-to-volume and volume-to-volume form factors. A much larger set of
simultaneous equations may then be constructed that relates all of these form factors.
An example image computed in this way is shown in Figure 18.73
(color plate). An overview of volume methods for radiosity is presented by Rushmeier 
[367]. 

Radiosity has been extended to account for furry surfaces [84] and bump-mapped 
surfaces [83]. 

Radiosity simulations are closely tied to the geometry of the scene; the form 
factors are innately geometric and depend on the mutual visibility of points in the 
environment. Typically if the objects move in a scene, the radiosity solution must 
be recomputed. However, it can be computationally efficient to update a radiosity 
solution (rather than freshly recompute it) if only a few objects in the scene move. 
The basic idea is that only the form factors between surfaces whose relative visibility 
has changed require recomputation. The update takes place in three stages. First, 
negative radiosity is shot between the affected patches; this removes the effect of 
their interaction. Second, the objects in the scene are moved to their new positions. 
Third, normal positive radiosity is balanced between the affected pairs of patches. 
This approach is described by George et al. [152] and Chen [85]. 

Because classical radiosity algorithms produce $O(N^2)$ form factors for $N$ patches,
it is desirable to keep the number of patches as small as possible. Even HR produces
an $O(N^2)$ set of form factors, but typically we start with far fewer surfaces than
in classical radiosity because they need not be premeshed. One way to simplify the
problem is to break it up into two smaller problems. Xu et al. have observed that
if we are computing a simulation of two rooms joined by a small doorway, as in
Figure 18.74, then we can approximate the transfer through the door as though it
were a single polygon [492].

First, solve the right-hand room on its own, as though the door were a perfectly
absorbing polygon. Then solve the left-hand room, treating the door as a radiator
of the light it absorbed in the first pass. Record the light falling on the door in
this solution, and then return to the right-hand room. Iterating this procedure will
eventually converge on an approximate solution. This method is attractive because
the cost of performing several solutions on $N/2$ polygons is cheaper than performing
one solution on $N$ polygons.

FIGURE 18.74

Two rooms joined by a door.
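
A rough count shows where the saving comes from. This sketch assumes a dense $N \times N$ form factor matrix and an even split of the polygons between the rooms; it ignores the door polygon and the number of passes, and the function names are hypothetical.

```python
def form_factor_count(n_polygons: int) -> int:
    """Number of entries in a dense N x N form factor matrix."""
    return n_polygons * n_polygons

def split_cost(n_polygons: int, n_rooms: int = 2) -> int:
    """Total entries when the scene is split evenly into rooms that are solved separately."""
    per_room = n_polygons // n_rooms
    return n_rooms * form_factor_count(per_room)

n = 10_000
print(form_factor_count(n))  # one monolithic solution
print(split_cost(n))         # two rooms of N/2 polygons each
```

Under these assumptions the two-room split needs half as many form factors as the single large problem, and each room's matrix can be reused from one pass to the next because only the door's radiosity changes.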

Related approaches have been described by Neumann and Neumann [317] and 
Rushmeier et al. [366]. They simplify the model to create a radiosity solution on 
a low-complexity database. This is similar to the multigridding approach used in 
adaptive hierarchical radiosity. An excellent introduction to multigrid methods is 
available in the tutorial by Briggs [63]. 


18.10 Further Reading 

This chapter has only surveyed some of the larger issues in radiosity. As a relatively 
new algorithm of great practical value, radiosity is a subject of intense active research, 
and there are plenty of important practical issues that should be considered if you 
are planning to write a radiosity system. 

The best places to go for more information are the excellent recent books by 
Cohen and Wallace [99] and Sillion and Puech [409]. Each offers plenty of theoretical 
analysis, practical advice, and a substantial bibliography. An extensive analysis of 
radiosity in the world of two dimensions has been carried out by Heckbert [202]. A 
nice short retrospective survey has been written by Wallace [459]. 

The basic ideas behind radiosity have been used in the field of heat transfer for
years. The classic texts in that field are the books by Sparrow and Cess [417] and
Siegel and Howell [406]; they both offer a wealth of information on topics relevant 
to radiosity. 

In addition to the matrix solution discussion offered by Gortler and Cohen [166], 
Greiner et al. discuss a variety of methods for reducing the time required to solve the 
radiosity problem [172], and Shao and Badler offer a survey and comparison [391]. 

Linear basis functions have been studied by a variety of researchers. Max and 
Allison have explored the use of linear, tent-shaped basis functions that have a 
height of 1 at a given vertex and fall off linearly to 0 at all other vertices; they have 
developed an efficient algorithm for computing images with such functions using 
the linear interpolation hardware in real-time graphics rendering machines [288]. A
similar approach using linear basis functions was described by Bao and Peng [29], 
who approximated curved surfaces with a triangular polygonal mesh for the storage 
of radiosity. The radiosity value at any point within the mesh could be derived from 
linear interpolation of the vertex radiosities. Bian et al. used linear basis functions 
over quadrilaterals based on bilinear interpolation among the vertices [44]. The use 
of Galerkin bases was developed by Heckbert for the special case of 2D radiosity 
[202]. Discussions of the Galerkin solution are offered by Heckbert [202], Troutman 
and Max [440], and Zatz [503]. 

The hemicube technique relies on efficient scan-conversion for identifying the cells 
occupied by each patch in the environment. This scan-conversion may be accelerated 
with the techniques of Greene et al. [169] and Teller and Hanrahan [435]. Both of 
these papers provide good bibliographies covering related work. Finding the best 
distribution of cells on the hemicube face has been studied by Max and Troutman 
[285]. 

Surveys of form factor methods for graphics are available in the paper by Pueyo 
[350] and in the books by Cohen and Wallace and Sillion and Puech mentioned above. 
Some form factor algorithms that combine several simpler approaches are explored 
by Pietrek [341]. The heat transfer literature contains a rich body of material on form 
factors and their computation. The survey by Walton [462] compares form factor 
calculations in terms of their utility to that community. Emery et al. have compared 
a number of algorithms in terms of computational efficiency and accuracy; their 
findings are described in [136]. Their conclusion was that when there was enough 
computational power to justify the expense, Monte Carlo techniques proved the best 
method for estimating the form factor in general. 

The discussions of hierarchical radiosity and importance-driven refinement left 
out many details that are important in a practical system. An early form of hierarchical
radiosity was the two-tier approach due to Cohen et al. [97]. Implementors
are urged to review the original hierarchical refinement paper by Hanrahan et al.
[192] and the wavelet radiosity articles by Gortler et al. [167] and Schröder et al.
[384]. Importance-driven refinement was introduced by Smits et al. [414]; recent
extensions have been described by Aupperle and Hanrahan [24] and Christensen et
al. [89]. 

There is a quickly growing body of literature on meshing for radiosity; much 
of this growth is being fueled by incorporation of results from finite elements. A 
good survey of meshing techniques is given by Cohen and Wallace [98]. More recent 
developments in discontinuity meshing are offered by Asensio [18], Baum et al. 
[32], Campbell and Fussell [73], Heckbert [202,210], Lischinski et al. [274,275], 
and Sillion [407]. In particular, Lischinski et al. [274] describe the combination 
of discontinuity meshing with hierarchical radiosity. Rather than mesh a priori, 
we might attempt to generate a good mesh and then move the mesh points into a 
better position to adapt to the local illumination; Aguas and Muller describe such 
an approach [4]. 

These papers also offer pointers into the extensive finite elements and computational
geometry literature on meshing. A good starting point for this literature is
Ho-Le [212]. 

Form factors in specular environments have been examined by Eckert and Sparrow
for heat and mass transfer [134]; these ideas were applied to radiosity by
Rushmeier and Torrance [370]. Extended form factors have also been investigated 
by Aupperle and Hanrahan [23], Bao and Peng [29], Bouatouch and Tellier [56], 
Bouville et al. [57], Chen et al. [86], Chen and Wu [82], Hall and Rushmeier [180], 
Kok et al. [250], Le Saec and Schlick [259], Shirley [396], Sillion et al. [408,410], 
and Wallace et al. [460]. In particular, Aupperle and Hanrahan [23] have combined 
a three-point transport formulation with hierarchical radiosity. 

Some hardware and multiprocessor implementations of radiosity algorithms are 
described by Vilaplana and Pueyo [454,455], Varshney and Prins [452], Bu and 
Deprettere [67], Baum and Winget [34], Drucker and Schröder [128], Drettakis et
al. [127], Puech et al. [349], and Purgathofer and Zeiller [352]. 


18.11 Exercises

Exercise 18.1

Suppose that a hemicube is placed over a point with top face n x n and side faces 
n x n/2. How many delta form factors do you need to store? What are they? 

Exercise 18.2

In Table 18.1, when we increased the reflectivity of patch A from 0 to 1/10 in the 
first two lines, the sum of the radiosities increased. Does this mean that there is more 
power in the environment? Are we getting something for nothing? 

Exercise 18.3

Consider Figure 18.75 showing an infinite rectangular tube of dimensions $a \times b$. Write
the eight form factors for this system (use the form factors in Equation 18.30). Write
a computer program to evaluate the form factor matrix for different reflectivity and 
emission values. Determine the radiosities for the conditions in Table 18.3. Interpret 
your results. 

Exercise 18.4

Review the paper by Baum et al. on meshing for radiosity [32]. Can you describe 
any other problem cases that they did not cover? How hard would it be to write a 
program that includes both their observations and discontinuity meshing? What do 
you think would happen to the number of polygons in the system? Is there a way to 
control the number of polygons? 

Exercise 18.5

One problem with hierarchical radiosity is that it starts with large patches and refines 
them, while sometimes we are given a database consisting of a large number of small 
polygons. Can you suggest methods for clustering these polygons into larger pieces 
appropriate for refinement with HR? 

Exercise 18.6

Consider two spheres with centers $A$ and $B$, which are each of radius 1 meter and
2 kg in mass, separated by a distance of 5 meters, as in Figure 18.76. The spheres
are joined by the line $AB$ with midpoint $C$. Measured along a line through $C$ and
perpendicular to $AB$, how far away would you have to go in placing a 6-kg mass at
$T$ in order for the gravitational force due to the pair to be indistinguishable from a
single mass of 4 kg located at $C$? Assume three digits of precision. The gravitational
force due to an object with mass $m_1$ at point $P$ experienced by a mass $m_2$ at point
$Q$ at a distance $r = |P - Q|$ is given by

$$F = G\, \frac{m_1 m_2}{r^2}, \qquad G = 6.6720 \times 10^{-11}\ \mathrm{N \cdot m^2/kg^2} \tag{18.111}$$

in the direction of $P$, where $G$ is a universal constant for all pairs of particles, and
N is the force in newtons.

TABLE 18.3

Conditions for Exercise 18.3.

Test number   Reflectivity $[\rho_A, \rho_B, \rho_C, \rho_D]$   Emissivity $[E_A, E_B, E_C, E_D]$
a             (1, 0, 0, 0)                                      (0, 1/2, 1/2, 1/2)
b             (1, 0, 0, 0)                                      (1/2, 1/2, 1/2, 1/2)
c             (0, 1, 0, 0)                                      (0, 1, 0, 0)
d             (1, 1, 0, 0)                                      (0, 0, 1, 1)
e             (1, 0, 1, 0)                                      (3/4, 3/4, 3/4, 3/4)
f

FIGURE 18.76

Two spheres.





At the bottom of Mount I, a monk had built a
hermitage, and Kyozan went there and told him
what Isan had said, namely: “Most people have
the great potentiality, but not the great
function.” The monk told Kyozan to ask him
concerning the matter; but when Kyozan was
about to do so, the monk kicked him in the
chest and knocked him down. Kyozan went
back to Isan and told him, whereupon Isan
gave a great laugh.

R. H. Blyth

(“Zen and Zen Classics,” 1978)


19.1 Introduction

In Chapter 18 we discussed methods for explicitly constructing a distribution of 
light in an environment. An alternative solution returns to the original radiance 
equation and uses Monte Carlo point-sampling techniques to estimate the radiance 
at particular points $(p, \vec{\omega})$ in phase space. The approach is generally driven by
the desire to find function values over the viewing surface, which in turn requires 
the generation of radiance values within the environment. The sampling of the 
viewing plane generally proceeds from the identification of points on the plane, and 
directions that can influence those points. The goal is to estimate the irradiance 
signal around each of those points within the necessary solid angle. We know from 
the radiance law and our construction of the radiance equation that we can find the 
radiance arriving at a point from a given direction by finding the radiance leaving 
that surface point that is visible from the shading point in that direction along with 
the volumetric effects adding and removing light along the way. To find this visible 
point, we typically use a body of techniques known as ray tracing. 

In one sense the ray-tracing method is nothing more than an application of the 
Monte Carlo methods from Unit II to the full radiance equation (or any of its
variants in Chapter 17). Theoretically, we can sample the radiance function directly,
and as long as bias is avoided (or accounted for), then we can derive an estimator 
for the function. In practice, however, the costs of ray tracing and the complexity 
of the function make direct evaluation prohibitively expensive. A large variety of 
techniques have been developed to explore methods to accelerate the process. 

This acceleration has been focused on finding the point of first intersection between
a ray and the environment. This is because the ray-tracing process executes
this operation many millions of times per image; each execution should be as efficient 
as possible. 


19.2 Photon and Visibility Tracing

The ray-tracing approach is deeply entrenched in classical, geometrical optics: we 
assume that all objects are much larger than the wavelength of light, and that light 
travels in straight lines (relaxing the first condition leads to refraction and diffraction, 
and relaxing the second allows relativistic effects). Suppose that we are simulating a 
scene composed of two opaque patches, $P_1$ and $P_2$, viewed from an eye position $E$,
and that there is a single small (but finite) light source $L$ with uniform illumination in
the scene, as in Figure 19.1(a). We suppose that we can see $P_1$ from $E$ as indicated,
and we want to find the light reflected back to the eye.

By examination of the figure, we can see that in addition to any light that $P_1$
emits on its own, it can only reflect light coming directly from $L$. The mechanics
of the reflection are described completely by the BRDF at $P_1$, so we need only find
the illumination from L. In general, there are two ways to go about finding this 
illumination. 

The first method, illustrated in Figure 19.1(b), is called photon tracing. The
general idea is that we generate a large number of photons radiated from $L$ and
follow them into the scene. Some fraction will strike $P_1$, and that will represent an
estimate of the incident illumination on $P_1$ for the purposes of applying a shading
model.
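
Photon tracing can be sketched in a few lines. This example simplifies the setup described above: the light is treated as an isotropic point source at the origin rather than a small finite source, the patch is an unoccluded square, and the function name `fraction_hitting_patch` is an assumption made for illustration.

```python
import math
import random

def fraction_hitting_patch(n_photons: int, seed: int = 0) -> float:
    """Estimate the fraction of photons from a point source that strike a patch.

    The source sits at the origin and radiates uniformly in all directions.
    The patch is the square [-1, 1] x [-1, 1] in the plane z = -1.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_photons):
        # Choose a direction uniformly on the unit sphere.
        dz = 1.0 - 2.0 * rng.random()
        phi = 2.0 * math.pi * rng.random()
        radius = math.sqrt(max(0.0, 1.0 - dz * dz))
        dx, dy = radius * math.cos(phi), radius * math.sin(phi)
        if dz >= 0.0:
            continue  # the photon never reaches the plane of the patch
        t = -1.0 / dz  # ray parameter at which the photon reaches z = -1
        if abs(t * dx) <= 1.0 and abs(t * dy) <= 1.0:
            hits += 1
    return hits / n_photons

print(fraction_hitting_patch(100_000))
```

In this particular geometry the patch is one face of a cube centered on the source, so the exact fraction is 1/6; the estimate approaches that value as more photons are fired.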

An alternative method is called visibility tracing, illustrated in Figure 19.1(c). The
idea here is to look around the shading point and try to find the radiance value at 
every surface point that can contribute illumination. We do this by sending rays into 
the environment and determining which object is seen by which ray; the radiance sent 
from that object toward the shading point contributes to the overall illumination. 

We will now look at these two approaches. Because it is a more widely used 
technique, we will start with visibility tracing.

FIGURE 19.1

(a) A simple scene. (b) Sending light from the source. (c) Seeking light from a patch.


19.3 Visibility Tracing


In visibility tracing we build up a chain of light-object interactions in reverse, starting 
with those at the image surface and eventually working our way back to the light 
sources. Figure 19.2 shows a typical result in a simple scene. In Figure 19.2(a) we 
show the geometric history of a ray from the eye as it strikes objects and then creates 
new rays, seeking the illumination at those intersections. In Figure 19.2(b) we have 
abstracted away the geometry and show just the tree of rays. 

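For visibility tracing in a vacuum, we use the OVTIGRE form of the radiance
equation from Equation 17.16:

$$L(r, \vec{\omega}^o) = L^e(r, \vec{\omega}^o) + \int_{\Theta^i} f(r, \vec{\omega} \to \vec{\omega}^o, \lambda)\, L(r, \vec{\omega}) \cos\theta \, d\vec{\omega} \tag{19.1}$$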

In image synthesis, the two techniques that have proven most useful to aid in a 
Monte Carlo evaluation of this integral are stratification and importance sampling. 
So our first step will be to tile the input domain $\Theta^i$ of Equation 19.1 into $s$ individual
strata $D_k$ (recall that these are nonoverlapping subdomains that together match the
input domain):

$$\Theta^i = \bigcup_{k=1}^{s} D_k \tag{19.2}$$

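Within each stratum $k$ we apply importance sampling, which means multiplying by
a pdf $g_k$ and then dividing by that pdf so we don’t introduce bias. The result is

$$\begin{aligned}
L(r, \vec{\omega}^o) &= L^e(r, \vec{\omega}^o) + \int_{\Theta^i} f(r, \vec{\omega} \to \vec{\omega}^o, \lambda)\, L(r, \vec{\omega}) \cos\theta \, d\vec{\omega} \\
&= L^e(r, \vec{\omega}^o) + \sum_{k=1}^{s} \int_{D_k} f(r, \vec{\omega} \to \vec{\omega}^o, \lambda)\, \frac{L(r, \vec{\omega}) \cos\theta}{g_k(r, \vec{\omega})}\, g_k(r, \vec{\omega}) \, d\vec{\omega}
\end{aligned} \tag{19.3}$$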
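
The combination of stratification and importance sampling is easier to see in one dimension. The sketch below estimates an integral over $[0, 1]$ rather than over a set of directions; the linear pdf used inside each stratum and the function name `stratified_importance_estimate` are assumptions chosen only to keep the example short.

```python
import math
import random

def stratified_importance_estimate(f, n_strata, samples_per_stratum, seed=0):
    """Estimate the integral of f over [0, 1].

    The domain is tiled into equal strata D_k = [a, b]. Inside each stratum,
    samples are drawn from the pdf g_k(x) = 2x / (b^2 - a^2), and every sample
    contributes f(x) / g_k(x), which keeps the estimator unbiased.
    """
    rng = random.Random(seed)
    total = 0.0
    for k in range(n_strata):
        a, b = k / n_strata, (k + 1) / n_strata
        norm = b * b - a * a
        stratum_sum = 0.0
        for _ in range(samples_per_stratum):
            u = 1.0 - rng.random()            # u lies in (0, 1], so x > 0
            x = math.sqrt(a * a + u * norm)   # inverse-CDF sample from g_k
            g = 2.0 * x / norm
            stratum_sum += f(x) / g
        total += stratum_sum / samples_per_stratum
    return total

estimate = stratified_importance_estimate(lambda x: x * x, n_strata=8,
                                          samples_per_stratum=64)
print(estimate)  # should land near the exact value 1/3
```

Each stratum contributes the average of $f/g_k$ over its own samples, which is the one-dimensional analogue of a single term in the sum of Equation 19.3.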


The stratification of the set of incident directions $\Theta^i$ on the sphere around $s$ induces
a stratification on the set of all surfaces $M$. To see this, consider Figure 19.3(a),
where two strata have been isolated. If we build a cone defined on each stratum with
its apex at the origin, then those cones sweep out into the environment, as shown
in Figure 19.3(b), and intersect objects. The cone stratifies all points in the environment
into two sets: those inside the cone (or on its surface), and those outside the
cone. Each time a cone passes through a surface, it divides the surface into those two
classes. In other words, the cones induce a stratification of each surface, as shown
in Figure 19.3(c).

There is a one-to-one correspondence between strata on the incident hemisphere
$\Theta^i(s)$ and the strata on the surfaces $M$. It is important to note that the cones
in Figure 19.3(b) can be defined using only the apex point $s$ and a cross section.
Whether this cross section is on the hemisphere or on one of the object surfaces
doesn’t matter. Strata on the hemisphere induce strata on surfaces, and vice versa.

FIGURE 19.2

(a) Using visibility ray tracing in a scene. (b) The tree corresponding to the rays in (a).

FIGURE 19.3

(a) Two strata on a direction hemisphere. (b) The projection of those strata into the environment.
(c) The induced strata on the surfaces.

This observation is very important, because it makes explicit the fact that the
domains $D_k$ in Equation 19.3 can refer to either sets of directions around $s$, or sets
of points on the surfaces $M$.

The general procedure in visibility ray tracing is to choose some strata in each 
domain, resolve them so they don’t overlap, and then sample the strata. That is, to 
find the illumination at a point, we choose some directions and some surfaces from
which we want to gather light. The directions are usually chosen on the basis of
the geometry of the shading situation and the object’s BDF, while the surfaces are 
usually chosen on the basis of how much light they emit and propagate. 

For example, suppose that we are viewing a shiny surface from a particular 
angle. We expect that light coming in from near the specularly reflected direction 
will be important, so we might densely stratify the set of directions near the reflected 
direction. And if there are some bright light sources in the scene, we will probably 
want to make sure we pay some attention to them, so we stratify the surfaces of those 
sources into several domains. In other words, the subdivision of the direction set 
creates strata through which we push visibility to search the world, and subdivision 
of the surfaces creates strata which pull visibility information from the shading point 
toward the surface. 

To make this distinction formally, we will introduce some notation to distinguish 
different sets of points and directions. 


19.3.1 Strata Sets

In this section we will introduce four different types of sets: two sets of directions 
and two sets of surface points. We will call these different collections strata sets since 
each one is based on elements of a particular stratum. 

For convenience we will write a point $p$ as some point along the ray $\vec{\omega}$ coming
into $s$:

$$p = s - \alpha\vec{\omega}, \qquad \alpha > 0 \tag{19.4}$$

Notice that $\alpha > 0$ enforces the condition that this is a point that is seen from $s$ by
looking backward along the incident vector $\vec{\omega}$. Similarly, we define $N(s, \vec{\omega})$ to be the
point $p$ with the smallest such $\alpha$ that generates a point on $M$, as in Figure 19.4. We
can use the ray-tracing (or visibility) function $v(r, \vec{\omega})$ defined in Equation 12.94 to
define this value:

$$\begin{aligned}
N(s, \vec{\omega}) &= s - \left[\inf\{\alpha > 0 : s - \alpha\vec{\omega} \in M\}\right] \vec{\omega} \\
&= s - v(s, \vec{\omega})\, \vec{\omega}
\end{aligned} \tag{19.5}$$
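
Equation 19.5 can be sketched for a scene whose surface set $M$ consists only of spheres. This is an illustration under that assumption, not a general ray tracer; the function name `nearest_hit`, the `(center, radius)` representation, and the small tolerance used to reject the starting point are all choices made for the example.

```python
import math

def nearest_hit(s, omega, spheres):
    """Return (alpha, point) for the smallest alpha > 0 with s - alpha*omega on a sphere.

    `omega` is a unit incident direction, so looking backward along it from `s`
    means marching along -omega. Returns None if no sphere is hit.
    """
    d = (-omega[0], -omega[1], -omega[2])       # direction of travel away from s
    best = None
    for center, radius in spheres:
        oc = (s[0] - center[0], s[1] - center[1], s[2] - center[2])
        b = oc[0] * d[0] + oc[1] * d[1] + oc[2] * d[2]
        c = oc[0] ** 2 + oc[1] ** 2 + oc[2] ** 2 - radius ** 2
        disc = b * b - c
        if disc < 0.0:
            continue                            # the line misses this sphere
        root = math.sqrt(disc)
        for alpha in (-b - root, -b + root):    # both roots, nearer one first
            if alpha > 1e-9 and (best is None or alpha < best):
                best = alpha
                break
    if best is None:
        return None
    point = (s[0] + best * d[0], s[1] + best * d[1], s[2] + best * d[2])
    return best, point

scene = [((0.0, 0.0, -10.0), 2.0), ((0.0, 0.0, -5.0), 1.0)]
print(nearest_hit((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), scene))
```

The returned `alpha` plays the role of $v(s, \vec{\omega})$, and the returned point is $N(s, \vec{\omega})$.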

We will relate points in space to directions around $s$. The directions always come
from the set $\Theta^i(s)$, so when we say a point is “visible” from $s$, we mean that we can
find a value $\alpha > 0$ that will fit the definition for $p$. In other words, if we can look
backward along $\vec{\omega}$ to a point $p$, we say $p$ is visible to $s$ along $\vec{\omega}$.

The first strata sets we will consider are the two direction-driven sets. Suppose
we are given a solid angle $\Gamma_i$. Then $P_i = P(\Gamma_i)$ is the set of all surface points that
can be reached by a ray passing through that solid angle, as shown in Figure 19.5.
We define this set as

$$P_i = P(\Gamma_i) = \{p : p \in M, \vec{\omega} \in \Gamma_i\} \tag{19.6}$$

Note that the points need not be visible from $s$ to be in $P_i$. This solid angle induces
strata on all the surfaces in the scene which are at least partially within it.

FIGURE 19.4

Determining the first point intersected by a ray.

FIGURE 19.5

$P(\Gamma_i)$ is the set of all surface points within $\Gamma_i$.

FIGURE 19.6

$\overline{P}(\Gamma_i)$ is the set of all visible surface points within $\Gamma_i$.

A closely related strata set contains only those points in the set $P_i$ that are directly
visible from $s$; that is, there is no other surface between that point and $s$. We write
this set $\overline{P}_i$, and define it:

$$\overline{P}_i = \overline{P}(\Gamma_i) = \{p : p = N(s, \vec{\omega}), \vec{\omega} \in \Gamma_i\} \tag{19.7}$$

Notice that the only difference between $P_i$ and $\overline{P}_i$ is that the latter includes only
points $N(s, \vec{\omega})$, not all points on $M$. The set $\overline{P}_i$ is illustrated in Figure 19.6.

Notice that the sets $P_i$ are not necessarily mutually exclusive, but the sets $\overline{P}_i$ are.
We can express this symbolically by saying that the empty set $\emptyset$ is always the result
of intersecting any two $\overline{P}_i$, but not necessarily any two $P_i$. For any two $i \ne k$,

$$\begin{aligned}
\emptyset &\subseteq P_i \cap P_k \\
\emptyset &= \overline{P}_i \cap \overline{P}_k
\end{aligned} \tag{19.8}$$

FIGURE 19.7

$\Pi(G_i)$ is the set of all directions leading to a point in $G_i$.

The sets $P_i$ and $\overline{P}_i$ defined surface strata in terms of directional strata. The next
two sets propagate information in the opposite direction.

Consider a set of points $G_i$ defined over some set of surfaces. The set of all
directions in which these points are visible is the set $\Pi_i = \Pi(G_i)$. We define this
point-driven set as

$$\Pi_i = \Pi(G_i) = \{\vec{\omega} : p \in G_i\} \tag{19.9}$$

This definition is illustrated in Figure 19.7. For clarity in the figure, the set $G_i$ is
shown on only one surface, though in general it may extend over several surfaces.
Taking the natural boundaries of surfaces as the edges of strata, we can always break
down $G_i$ into a set of smaller point sets that are equivalent to $G_i$, yet each point set is
contained on a single surface.

Not all the elements of $G_i$ are necessarily directly visible to $s$, due to intervening
objects that may block visibility. The set $\overline{\Pi}_i = \overline{\Pi}(G_i)$ is the set of directions that see
elements of $G_i$ directly, as shown in Figure 19.8. We define $\overline{\Pi}_i$ as

$$\overline{\Pi}_i = \overline{\Pi}(G_i) = \{\vec{\omega} : N(s, \vec{\omega}) \in G_i\} \tag{19.10}$$

As with the point sets, the intersection of two different $\Pi_i$ is not necessarily empty,
though the intersection of two different $\overline{\Pi}_i$ will be. In general, for any $i \ne k$,

$$\begin{aligned}
\emptyset &\subseteq \Pi_i \cap \Pi_k \\
\emptyset &= \overline{\Pi}_i \cap \overline{\Pi}_k
\end{aligned} \tag{19.11}$$

FIGURE 19.8

$\overline{\Pi}(G_i)$ is the set of all directions that can directly see a point in $G_i$.

These sets allow us to construct strata either on surfaces in space or on the
directions around a point, and make sure that we’re not counting anything twice.
The general plan is to stratify the environment into strata $G_i$ and simultaneously
stratify the directions into strata $\Gamma_i$. All of the $\Gamma_i$ are projected into space to further
stratify the surfaces, and all the $G_i$ are projected onto the direction sphere to further
stratify it. The result is a single, unified set of strata that are consistent in both
domains. We call this resolution of the strata. At the end of a resolution step, there
are as many surface strata as there are solid angle strata.

Figure 19.9 shows an example of this operation. The solid angle $\Gamma_i$ on the
direction sphere includes a circular stratum on each surface, and their edges induce 
strata on the direction sphere and each other. Figure 19.9(b) shows the view from 
the point s; the circle and the three rectangles together create eleven different regions 
in the scene. The stratification on the solid angle and each surface is shown in 
Figure 19.9(c). The solid angle itself subdivides into five solid angles, and the 
nearest-to-farthest surfaces are divided into seven, four, and nine regions. 






FIGURE 19.9

(a) A solid angle and some intersected objects. (b) The view from the hemisphere. (c) Resolved
strata on the hemisphere and each patch.

FIGURE 19.10

Visibility-resolved strata.


Recalling Equation 19.3, which wrote the integration over $\Theta^i$ in terms of smaller
domains $D_k$, each corresponding to a $\Gamma_k$, we can now relate a particular region on a
particular surface with each solid angle $\Gamma_k$.

If we were to actually compute the combination of each surface stratum (including 
edges) with each solid angle stratum, the result would be a combinatorial explosion. 
But we don’t actually need each individual stratum, since we are really only interested 
in the nearest-visible surfaces. Figure 19.10 shows that in our example, each surface 
has only one visible stratum, and the solid angle has only three strata. We call this 
much smaller set of strata the visible resolution of the strata. 


19.3.2 Applying Resolved Strata 

The heart of the resolved strata method is that each stratum may be represented 
by either its solid angle representation or its surface representation. Recalling our 
stratified form of OVTIGRE given in Equation 19.3, we have our choice of writing 
each integration over a domain over either the solid angle or the surface associated 
with it. 

The only remaining step is to find the appropriate integration expression representing
the light due to each type of stratum. We will derive these in the same order
in which they were presented above. 

Points A and B and a direction ω.

A stratum P_i is a point set of all points on all objects visible through a solid angle
Γ_i. We need only integrate each direction over Γ_i and apply a visibility test at each
point. If the point is visible, the visibility test returns 1 and the light is added in to
the incident light in that direction; if the test is 0, that light is ignored. Only one
point for each ω ∈ Γ_i will have a visibility value of 1. The integral is then



$$\int_{D_i} = \int_{p \in P_i} \rho(s, \omega \to \omega^o)\, L(p, \omega)\, V(p, s) \cos\theta \, dp \tag{19.12}$$


The expression for a visibility-resolved point set P_i is similar, but because P_i already contains only those points
that are directly visible, we can drop the visibility function:

$$\int_{D_i} = \int_{p \in P_i} \rho(s, \omega \to \omega^o)\, L(p, \omega) \cos\theta \, dp \tag{19.13}$$

Now we turn to the direction sets. Things are slightly trickier here than for the 
point sets. A surface point can be reached from another point s from only a single 
direction, but a ray from s in that direction may intersect the surface several times. 
For example, consider Figure 19.11. The point labeled A on the sphere has only a 
single direction ω associated with it. But a ray in that direction may intersect the
sphere not only at A, but at B as well. In general, the number of times a ray may 
intersect a surface is proportional to the order of the surface: a plane has order 1, a 
sphere has order 2, a cubic patch has order 3, and so on. 

When we integrate over a direction set Π_i, we need to identify every intersection
of a ray ω ∈ Π_i with every one of the N objects in the environment. If object m has
degree d_m, then we can find the point p_{m,n} corresponding to the nth intersection
with object m. Multiplying the radiance there by the visibility function as before
gives us the contribution of the radiance leaving that point to s:



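$$\int_{D_i} = \int_{\omega \in \Pi_i} \rho(s, \omega \to \omega^o) \left[ \sum_{m=1}^{N} \sum_{n=1}^{d_m} L(p_{m,n}, \omega)\, V(p_{m,n}, s) \right] \cos\theta \, d\omega \tag{19.14}$$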


Finally we come to the strata represented by the resolved direction sets Π_i. Such a stratum is a set of directions that is
guaranteed to lie within one stratum on one surface; we need only find that surface
and find the radiance there. So

$$\int_{D_i} = \int_{\omega \in \Pi_i} \rho(s, \omega \to \omega^o)\, L(N(s, \omega), \omega) \cos\theta \, d\omega \tag{19.15}$$

Summarizing the above discussion, there are four ways we can integrate the 
incident radiance falling on a point s through a given directional stratum. The
surface-based methods integrate over all points in the corresponding surface stratum,
applying a visibility term if necessary. The direction-based methods integrate over
all directions in the stratum, finding the corresponding surface points and using their
radiance. So for each directional domain D_i in Equation 19.3 (repeated here for
reference): 


L(r,u°) = L e (s,iu°) + T [ /(r ,Q -> <3°, A) L(r ’f } * r g k (r, £3) dw (19.16) 

tT[JD k 9k( r,a>) 

we have our choice of four equivalent methods, summarized in Equation 19.17. 


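$$
\int_{D_i} =
\begin{cases}
\displaystyle \int_{p \in P_i} \rho(s, \omega \to \omega^o)\, L(p, \omega)\, V(p, s) \cos\theta \, dp \\[1ex]
\displaystyle \int_{p \in P_i} \rho(s, \omega \to \omega^o)\, L(p, \omega) \cos\theta \, dp \\[1ex]
\displaystyle \int_{\omega \in \Pi_i} \rho(s, \omega \to \omega^o) \left[ \sum_{m=1}^{N} \sum_{n=1}^{d_m} L(p_{m,n}, \omega)\, V(p_{m,n}, s) \right] \cos\theta \, d\omega \\[1ex]
\displaystyle \int_{\omega \in \Pi_i} \rho(s, \omega \to \omega^o)\, L(N(s, \omega), \omega) \cos\theta \, d\omega
\end{cases}
\tag{19.17}
$$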


The general procedure for evaluating the propagated radiance at a point may be 
summarized in six steps: 

1 Choose Γ_i based on information at s.

2 Choose G_i based on information on surfaces.

3 Compute the visibility resolution of the strata. 

4 For each stratum, choose an integration method. 

5 Integrate each stratum. 

6 Test each stratum and adaptively subdivide if necessary, returning to step 1. 

Most ray-tracing methods published in the literature may be viewed as approximations
to this six-step procedure, which either avoid, combine, or approximate
different steps. We will review some of these methods below. 

An example of this process is shown in Figure 19.12. 

Our common departure point is the schematic visibility tracing diagram of Figure 19.13.
The diagram shows some patches and solid angles; each has been assigned
an arbitrary label for the purposes of discussion. To begin with, suppose we want
to find the energy carried back to the apex of solid angle Γ_1. This solid angle has a
corresponding domain on patch M_1. To find the light radiated by M_1 back into Γ_1,
we need to combine the self-emission of M_1 with the light propagated from there.
To find the propagated light from M_1, we need to know the light falling upon it.
Suppose we find the light leaving the portion of M_1 within Γ_1 by integrating over
every such point. The figure shows one such point.

To compute the illumination on a point within M_1, we can stratify the set of
directions around that point, stratify the surfaces in the environment, and resolve
the strata. We have labeled by Γ_2 and Γ_3 two of the solid angles around the point on
M_1; each of them strikes a surface. To find the light leaving each of those surfaces,
we stratify their direction sets and the environment, resolve the strata, and then
integrate, illustrated by the solid angles Γ_4 through Γ_7 in the figure.

This idea of recursive visibility for computing the radiance at a point was introduced
by Whitted [477]. The strength of the method lies in the close coupling of
visibility and illumination; a single data structure (the tree of intersections for a ray) 
can be used to carry both types of information at once. Notice that the stratification 
of both the direction sets and the environment will in general differ from point to 
point, even for nearby neighbors on the same patch. 


19.3.3 Direct and Indirect Illumination

One way to compute the illumination described by Figure 19.13 is to compute some 
directional and spatial strata, but not resolve the two sets. This can lead to error 
when a set of directions or a set of surface points is counted twice; we will return to 
this later. 

The most common approach to constructing strata is to distinguish between direct 
and indirect illumination. 




FIGURE 19.12

Visibility ray tracing with different strata. 







Visibility tracing. 


Direct light at a point is that light which comes from the luminaires without any 
other interactions along the way, as in Figure 19.14. Since this light is not diminished 
by reflection or transmission, if the luminaire is bright, then it is likely to make a 
large impact on the total illumination arriving at the point. To make sure that we 
include these important sources of illumination in our integral, we determine the 
light sources in the scene and stratify them. This stratification can be trivial (a single 
stratum over the source), finely subdivided (e.g., so each stratum carries an equal 
amount of energy), or aggregate (many luminaires combined into one stratum). 

(a) Direct illumination. (b) Indirect illumination.

Indirect light accounts for all illumination that does not come directly from a
luminaire. To model indirect light efficiently, we need to use information about
the surface, as well as the environment and the distribution of light within it. For 
example, a shiny surface will strongly reflect and transmit light by specular reflection, 
so it is important that we gather a good estimate of light that arrives from the specular 
directions. This means that to determine the incident light that will significantly 
contribute to the reflected (or transmitted) light in a particular direction, we place 
fine strata in the specular solid angles computed with respect to that direction. 
We can also use any available information about nearby surfaces that are likely to 
propagate significant quantities of light onto the shading point. For example, we 
may have determined the energy leaving some nearby surfaces in nearby directions 
at a previous step; that information can help us identify those surfaces as potential
sources of significant illumination. Then we stratify those surfaces in order to make
sure we get their illumination.

FIGURE 19.15

A simple case of indirect and direct strata.

Figure 19.15 shows both of these operations in a schematic view. A circular solid 
angle has been subdivided into four wedge-shaped strata in a specular direction, and 
a light source overhead has been subdivided into four rectangular strata. If this was 
where we stopped, then we would estimate the incident illumination by a sum over 
these eight strata. We would probably use some ad hoc measure (like ambient light, 
discussed in Chapter 15) to account for the other light not explicitly sampled. 

This distinction between direct and indirect illumination is simply a computational
convenience; as far as the shading point is concerned, it doesn’t really matter
whether incident light arrives directly from a luminaire or via propagation by another 
surface. As long as we resolve the strata before integration, there’s no problem. 

However, if we don’t resolve visibility, then the same region of surface can be
accounted for twice: once in a directional stratum and once in a surface stratum. 
Figure 19.16 shows the same example as in Figure 19.15, only this time the reflected 
strata overlap with the luminaire strata. If we simply sum together the integral over 
each stratum, then the luminaire will be accounted for in both places, causing an 
error. 

Overlap of direct and indirect strata.


If we don’t wish to resolve visibility before integration, then we can do it dynamically
by checking for duplicated regions in pairs of strata. Any region that appears in
two places is removed from one. The choice of region may be arbitrary or influenced 
by the specific algorithm being used. 

Most published algorithms generate strata on the fly using a combination of 
direction-based and surface-based heuristics, and then resolve those strata on the fly 
simply by checking the surface corresponding to each indirect stratum. If the surface 
is on a luminaire, then that information is either discarded, or transferred and saved 
with the appropriate stratum. That way, only indirect surfaces contribute to strata 
intended to capture indirect illumination. 

In all of the examples we will see below, the estimation of illumination is always 
begun with a set of surface-based strata on direct light sources and direction-based 
strata representing reflection and transmission. There is no resolution of these 
strata beforehand; generally the surface strata G_i are converted into direction strata
Γ_i = Π(G_i), and direction strata corresponding to specular reflections and transmissions
are generated directly at the shading point based on the shading geometry and
the BDF. 




A schematic form of beam tracing. 


Tracing Solid Angles

One method for generating and resolving strata actually projects into space the edges 
of the cones defined by the direction sets Γ_i and point sets G_i and intersects these
cones with the objects in the environment. 

The technique of beam tracing by Heckbert and Hanrahan [211] is illustrated 
schematically in Figure 19.17. (In this figure, and all following figures of this type, 
we draw only two direction-based strata at each shading point; other strata have 
similar forms.) Each solid angle is approximated with a polygonal cone (called a 
beam); if the environment is all polyhedra, then this solid angle is exact. The beams 
are clipped against objects in the environment as they are extended from the shading 
point. At any intersected object a new set of beams is generated to sample direct and 
indirect illumination. 




A schematic form of cone tracing. 

The method of cone tracing by Amanatides is similar but uses right circular cones 
rather than polyhedral cones [9]; it is illustrated schematically in Figure 19.18. Note 
that a noncircular solid angle may be approximated by a collection of circular cones 
for a more accurate fit. 

An advantage of beam and cone tracing is that they are able to compute surface 
strata efficiently and dynamically from directional strata. Care must still be taken 
when a direct light source occupies a stratum intended to collect indirect illumination. 

Ray Tracing 

If the strata are reduced to the size of a point, then a single ray suffices to sample
each one, and the correspondence between direction and surface strata becomes trivial: the
point set on each surface induced by a solid angle is the single point intersected
by the one ray in that solid angle. Similarly, there is only one ray associated with
any surface set G_i, since it contains but one point.

Classical ray tracing.

This leads us to a structure introduced by Whitted [477], and shown schematically 
in Figure 19.19. This is the classical ray tracing method.

Since all strata have been reduced to point size, then surface strata have only one 
point as well. Since direct light sources are represented by surface strata, we can 
only represent point sources. 

Figure 19.20 (color plate) shows an example of an image produced by classical 
ray tracing. Note the sharp shadows due to illumination from point sources, and 
the sharp images due to perfect reflection and refraction. 

Suppose that the strata have not been reduced to point size, but that we have
decided to sample the integrals within each stratum using Monte Carlo methods.
Then we create a number of samples in each stratum, possibly in a nonuniform
pattern, to sample the domain. Such a technique was suggested by Cook [101]. The
method is called distribution ray tracing, and is shown schematically in Figure 19.21
(the names distributed ray tracing and stochastic ray tracing are also used to describe
this algorithm).

Distribution ray tracing.

This approach may be viewed as a direct use of Monte Carlo methods to sample 
the signal represented by incident light on the shading point. Stratification is built 
in by the BDF and the surface strata representing direct illumination, and it’s easy to 
avoid duplication of strata: if an indirect sample lands on a luminaire, either ignore 
it or use it as a direct contribution (though the stratification on the luminaire must 
be adjusted to represent this additional piece of sampling). 
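
The "ignore it" policy for avoiding duplicated strata can be sketched in a few lines. This is only an illustration under simplifying assumptions: each direct stratum and each indirect sample is taken to be an already-weighted scalar contribution, and the function name and the `hit_is_luminaire` flag are hypothetical, not part of any published algorithm.

```python
def estimate_incident(direct_strata, indirect_samples):
    """Sum direct and indirect contributions without counting a luminaire twice.

    direct_strata: already-weighted contributions, one per luminaire stratum.
    indirect_samples: (contribution, hit_is_luminaire) pairs, where the flag
        says whether the indirect ray landed on a light source.
    """
    total = sum(direct_strata)
    for contribution, hit_is_luminaire in indirect_samples:
        if hit_is_luminaire:
            # The luminaire is already covered by the direct strata,
            # so this sample is ignored rather than added a second time.
            continue
        total += contribution
    return total


direct = [1.0, 1.0]                      # two strata on one light source
indirect = [(0.5, False), (2.0, True)]   # the second sample hit the light
print(estimate_incident(direct, indirect))
```

Simply summing every entry would count the luminaire in both places; dropping the flagged sample keeps only the direct estimate of that light.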

Another Monte Carlo approach to evaluating the integral was discussed in Chapter 16
as path tracing. This approach simply creates a path history for a single
particle interacting with the environment until absorption. That is, rather than
spawn new rays at an intersection, we simply choose a direction for the one ray to
follow. Particles are generated and followed until the confidence in the answer is
high enough. This method was proposed for image synthesis by Kajiya [234] and is
illustrated in Figure 19.22.

Path tracing.

Path tracing can be subtle to implement because the distribution of samples needs 
to follow the desired stratification of the surfaces and directions, yet the stratification 
is different for each ray cast into the environment from the eyepoint, since the first 
surface intersection will be different. Some larger-region averaging and history 
must be maintained in order to preserve the benefits of importance sampling and 
stratification [234]. 
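
The difference between spawning new rays at every intersection and following a single path can be made concrete by counting rays. The sketch below assumes a fixed number of rays spawned at each intersection (for example, one reflected and one transmitted) and a fixed maximum depth; both function names are hypothetical, and shadow rays are left out of the count.

```python
def rays_in_tree(branching, depth):
    """Rays traced when every intersection spawns `branching` new rays.

    Level 0 is the single ray from the eye; level k holds branching**k rays.
    """
    return sum(branching ** level for level in range(depth + 1))


def rays_in_path(depth):
    """Rays traced when each intersection continues in one chosen direction."""
    return depth + 1


for depth in (1, 3, 5):
    print(depth, rays_in_tree(2, depth), rays_in_path(depth))
```

The tree grows geometrically with depth while the path grows linearly, which is why a path tracer can afford many more samples from the eyepoint for the same amount of work.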

Camera Models

Before looking at the details of shading we will first look at the basic ideas for 
generating the first samples drawn by a visibility tracing algorithm: the samples that 
make up the image. 

Some sort of imaging model must be associated with the image surface that 
defines the image in object space; this limits the light that may strike the surface. 
Without such a limitation, our simulation would correspond to exposing a piece of 
photographic film in the real world by simply holding it up in the air. Light from 
all directions would reach the film, saturating it. Even if our film did not become
overexposed, there would be no discernible image. Like our eyes, every image
surface needs to be placed within an opaque enclosure that has a single aperture;
light passing through that aperture is the only light that can influence the image 
surface. 

The simplest type of aperture is a small hole. If the hole is of negligible size, then 
the imaging model is called a pinhole camera, after the physical device of the same
name. A schematic pinhole camera is shown in Figure 19.23. We have an image
plane located so that the normal through the center of the plane goes through the
pinhole at point H. For every point P on the image plane, the only light that can
contribute to P is that light arriving along the single ray P-H. This model can be
used to approximate our own eyes, when the pupil has contracted to its smallest 
diameter. 

Most photographic equipment contains one or more lenses to offer broader control
over the range of light than the fixed model provided by the pinhole camera.
The simplest camera model contains a single, thin, convex-convex lens, as shown
in Figure 19.24. In this context, thin is a technical term that we will discuss in a 
moment. The lens is said to be convex-convex (or double-convex) because both sides 
create a convex solid when viewed from the center of the lens. The type of lens we 
will consider here is formed from the intersection of two spheres as shown. 

The lens has two focal points at equal distances in front of and behind the lens.
As shown in Figure 19.25, these points lie on the axis a through the center point
C of the lens, at a distance f. Light that comes in from the left parallel to a (e.g.,
light emitted by an object infinitely far away) will be focused so that all its rays pass
through the secondary focal point F' on the right side of the lens. Similarly, light
radiated from the primary focal point F on the left side of the lens leaves on the right
side in parallel beams [311].

To make it easy to find this focus point, we assume that the lens is thin. Recall
that the lens is made up of the intersection of two spheres of radius r_1 and r_2. The
lens is said to be thin if the diameter d of the lens is much smaller than either radius:
d ≪ r_1 and d ≪ r_2. In a thick lens, the body of material that the light must pass
through is significant. That means that we must account for refraction upon entering
the lens, for the distance traveled, and then for refraction upon exiting. When the
lens is thin we can consider the distance traveled to be negligible; that means we can
combine the two refractions into one, occurring in the plane through the center of
the lens. This is called the thin lens approximation.

A pinhole camera.

To see how a thin lens focuses light, we can build a small imaging situation 
and read off the results. Conventionally we use the same labels for corresponding 
elements on both sides of the lens, distinguishing the elements on the right with a 
prime. Figure 19.26 shows the geometry for a thin lens. 

The lens has a central axis a that passes through the point C in the center of the
lens. The primary focal point F is located on this axis at a distance f left of the
center, and the secondary focal point F' is similarly at a distance f right of the center.
We will suppose that there is a disk of radius y perpendicular to the axis, centered at a
point M to the left of C. The distance |M − C| is called s.

A thin convex-convex lens formed by two spheres.

FIGURE 19.25

(a) Incident light parallel to a is focused at F'. (b) Light generated at F leaves parallel to a.

Geometry for imaging by a thin lens.


We’re now ready to find that distance on the right side of the lens where the disk
will be in focus; that is, if we built a real model of this situation, this is the distance
to the right of C where the image of the disk will be sharp. To find this distance, we
begin by selecting a point Q on the perimeter of the disk. As shown in the figure, we
trace a line from Q parallel to the axis until it intersects the (thin) lens at T. Now
we know by the construction of the lens that all rays coming into the lens parallel
to the axis will be refracted to pass through the secondary focal point F', so we
can simply draw the line TF'; this line carries some of the light from Q. Now we
also know that light rays leaving the lens on the right parallel to the axis must have
passed through the primary focal point F on the left side. So we draw another ray
from Q that passes through F, and follow it until it strikes the lens at S. According
to the construction of the lens, this ray emerges parallel to the axis, so we draw a
right-going line parallel to a from S. Eventually this ray will intersect the other ray
TF'. The intersection point is Q', and it defines the focused image of point Q on the
right side at a point M' from the center. The distance |C − M'| is called s'.

We can also trace a ray from Q through the center of the lens C; it will intersect
the other lines at Q', so any two of these three lines are sufficient to locate Q'. This
ray is called the chief ray [230]. So to locate Q' we can find either the intersection
of two of these lines or the intersection of any of them with the plane perpendicular
to the axis located at the distance s'.

The triangles for the thin lens model.

The problem now is to find s' for a lens of a given f and an object placed at a
given s. The geometry is summarized in Figure 19.27. We have labeled the distances
y = |T − C| and y' = |S − C|. By convention, y is positive and y' is negative. On
the left, we see that △QTS is similar to △FCS. Then corresponding sides have the
same ratio:

$$\frac{y - y'}{s} = \frac{-y'}{f} \tag{19.18}$$

(note that we needed to use -y' rather than y'). We can make the same observation
on the right and notice that △Q'TS is similar to △F'CT, so

$$\frac{y - y'}{s'} = \frac{y}{f'} \tag{19.19}$$

Adding these two equations, we find

$$\frac{y - y'}{s} + \frac{y - y'}{s'} = \frac{-y'}{f} + \frac{y}{f'} \tag{19.20}$$

Now we know by construction of the lens that f = f', so we can factor out that
common term and simplify:

$$(y - y')\left(\frac{1}{s} + \frac{1}{s'}\right) = (y - y')\left(\frac{1}{f}\right)$$

$$\frac{1}{s} + \frac{1}{s'} = \frac{1}{f} \tag{19.21}$$






















The cones of light from a point Q. 

Equation 19.21 is called the thin lens formula [230]. For a lens with a given
focal distance f, it tells us the relationship between any object at a distance s and
the distance of its image s'.
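
As a small worked example, the thin lens formula can be solved for the image distance s' and evaluated numerically. The sketch below assumes s and f are given in the same units; the function names are illustrative only, and the magnification follows from the similar triangles used in the derivation (y'/y = -s'/s).

```python
def image_distance(s, f):
    """Solve 1/s + 1/s' = 1/f for s', given object distance s and focal distance f."""
    if s == f:
        # An object at the focal point leaves the lens as parallel rays.
        raise ValueError("object at the focal point: the image is at infinity")
    return s * f / (s - f)


def magnification(s, f):
    """Ratio y'/y of image height to object height; negative means inverted."""
    return -image_distance(s, f) / s


for s in (75.0, 100.0, 200.0):
    print(s, image_distance(s, 50.0), magnification(s, 50.0))
```

An object at twice the focal distance is imaged at twice the focal distance on the other side, inverted and at its original size; moving the object farther away brings the image closer to the focal point.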

It’s important to observe that no matter where a source is on the left of the lens, 
we can place a screen anywhere to the right of the lens and receive light from that 
source. Figure 19.28 shows a cone of light leaving a point Q and impinging on 
the lens, and then a refracted cone leaving the lens. The apex of the cone is at the 
distance s, meaning that a sharp point of light at Q will appear as a sharp point of 
light on a plane perpendicular to a at s'. But as we move that plane along a, it slices 
the cone so the image of Q becomes a circle; Q thus appears as an out-of-focus little 
circle of light. This circle is called the circle of confusion. 

Computing the diameter of the cone of confusion.

The diameter of this circle can be found from Figure 19.29. The cone swept out
by Q is as large as the lens at a distance x = 0 to the right of the lens, and it has a
diameter of 0 at x = s. Since a cone is linear, these two measurements are all we
need; if the lens has a diameter d, then the diameter c(x) of the cone at distance x is
given by

$$c(x) = -x\frac{d}{s} + d \tag{19.22}$$

We can solve for s from the lens formula of Equation 19.21:

$$s = \frac{s' f}{s' - f} \tag{19.23}$$

and then plug this into the cone diameter:

$$c(x) = -x d\,\frac{s' - f}{s' f} + d \tag{19.24}$$

Equation 19.24 tells us that if there is a point at a distance s' from a lens of
diameter d, then that point will be blurred into a circle of diameter c(x) at a distance
of x units to the right of the lens. When x = s, the circle has a diameter of 0, and thus
the object is in focus.

Let’s rewrite the circle of confusion equation to isolate the lens diameter d: 


$$c(x) = d(1 - xg) \tag{19.25}$$

where we have swept all the geometry terms into a constant g. This tells us that when
the lens diameter is small, the growth in the size of the circle as we move away from s 
will be small. This algebra reflects the geometry of Figure 19.30(a): the lens diameter 
is small, so the cone diameters are small, and thus the circles of confusion are small. 
In the limit, the lens diameter goes to 0 and we have a pinhole, where everything is 
in perfectly sharp focus. When the diameter is large, as in Figure 19.30(b), then the 
circle of confusion grows quickly with distance from s. 
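
To see how the lens diameter scales the blur, Equation 19.25 can be evaluated directly. In this sketch the constant g is taken to be 1/s, where s is the distance behind the lens at which the cone comes to a point; the absolute value is an addition of this sketch, so that the result is a nonnegative diameter on either side of that distance. The function name is hypothetical.

```python
def cone_diameter(x, lens_diameter, s):
    """Diameter of the cone of confusion at distance x behind the lens.

    Implements c(x) = d(1 - x g) with g = 1/s; abs() keeps the diameter
    nonnegative once x passes the point s where the cone closes to a point.
    """
    return lens_diameter * abs(1.0 - x / s)


s = 50.0
for d in (2.0, 20.0):
    print(d, [round(cone_diameter(x, d, s), 3) for x in (40.0, 50.0, 60.0)])
```

For the same displacement from s, the blur diameter is directly proportional to the lens diameter, which is the algebraic form of the comparison drawn in the two panels of the figure.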

Cones of confusion. (a) Small lens diameter. (b) Large lens diameter.


Suppose that our image plane (like a piece of film in a camera) is at some given 
distance s from the center of the lens. Then we know that objects at distance s' will 
be in focus, and those nearer or farther will be out of focus. When the lens diameter 
is small, the amount by which objects blur as a function of their distance from s 
is small; we say that such a lens has a large depth of field, meaning that there is a
wide range of depths in which objects are nearly focused. When the lens is large,
the depth of field is small. The center of the field is determined by the relationship
between the lens focal length f and the film distance s', and the depth of the field is
determined by these factors and the diameter d of the lens. 

This is why photographers move the lens toward and away from the film to focus
on different depths, and change the f-stop (or aperture), which controls the diameter,
to adjust the depth of field. The f-stop numbers are set up on a typical camera so
that each increase in setting corresponds to a diameter change that halves the area
of the circle. The typical set of f-stop values is f/1.4, f/2, f/2.8, f/4, f/5.6, f/8,
f/11, f/16, f/22, f/32, f/45, and f/64. Larger f-stop values correspond to smaller
apertures [447].
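
The halving of the aperture area from one setting to the next can be checked numerically. The sketch below relies on the standard definition of the f-number as focal length divided by aperture diameter, which is not spelled out in the text above, and generates each stop as a factor of the square root of 2 larger than the last; the function names are illustrative. The values marked on a lens are conventional roundings of this sequence.

```python
import math


def f_numbers(count):
    """First `count` full stops, each sqrt(2) times the previous one."""
    return [math.sqrt(2.0) ** k for k in range(1, count + 1)]


def aperture_area(focal_length, f_number):
    """Area of a circular aperture whose diameter is focal_length / f_number."""
    diameter = focal_length / f_number
    return math.pi * (diameter / 2.0) ** 2


stops = f_numbers(12)
areas = [aperture_area(50.0, n) for n in stops]
ratios = [later / earlier for earlier, later in zip(areas, areas[1:])]
print(["%.2f" % n for n in stops])
print(["%.3f" % r for r in ratios])
```

Because the area depends on the square of the diameter, multiplying the f-number by the square root of 2 divides the area by 2, whatever the focal length.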

An algorithm for sampling the environment using the thin-lens model was proposed
by Cook [102]. Suppose we have the geometry of Figure 19.31: the film plane
is at a distance s' from a thin lens with focal length f and diameter d, and we want to
find the light striking a point on the film marked P'. From our construction, we can 
find the point in the environment that would come into perfect focus at P' by tracing 
a line from P' through the center of the lens C and finding its point of intersection 
with the focal plane at a distance s on the other side of the lens. This intersection 
point is labeled P. We know that all rays that contribute to P' come to it through a 
cone which has P as an apex and the lens as a cross section. 

To find the total contribution of the environment to P', we need to integrate the 
radiance coming through that cone. We can numerically estimate the radiance by 
taking points E on the lens, and tracing rays from those points through P. Because 
the lens is thin, we can generate points on the lens by simply distributing them on a 
disk of diameter d centered at C. The rays may be written 

$$R = P + \alpha(P - E) \tag{19.26}$$

for 0 < α ∈ ℝ. We will call the lens points E since they are effectively the location
of an observer’s “eye” for that ray.
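
Generating the lens points and the corresponding rays might look like the following sketch. It makes several assumptions that are not in the text: the lens axis is the z axis, lens points are drawn uniformly over the disk, and each ray is parameterized from the lens point E toward P (so that objects between the lens and the focal plane are also encountered). All names are hypothetical.

```python
import math
import random


def sample_lens_point(center, diameter, rng):
    """Uniformly sample a point on the lens disk, assumed perpendicular to z."""
    r = 0.5 * diameter * math.sqrt(rng.random())  # sqrt gives uniform area density
    phi = 2.0 * math.pi * rng.random()
    return (center[0] + r * math.cos(phi),
            center[1] + r * math.sin(phi),
            center[2])


def lens_ray(E, P):
    """Origin and unit direction of the ray from lens point E through point P."""
    d = tuple(p - e for p, e in zip(P, E))
    length = math.sqrt(sum(c * c for c in d))
    return E, tuple(c / length for c in d)


rng = random.Random(7)
C = (0.0, 0.0, 0.0)      # lens center
P = (0.3, -0.2, 10.0)    # point on the focal plane that images to P'
rays = [lens_ray(sample_lens_point(C, 2.0, rng), P) for _ in range(4)]
for origin, direction in rays:
    print(origin, direction)
```

Every ray in the bundle passes through P, so a surface at P contributes the same point to all of them and appears sharp, while surfaces nearer or farther are struck at different points by different rays and are blurred.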


Distribution Ray Tracing 

Distribution ray tracing and path tracing are the most elegant and complete of the
ray tracing methods we have seen so far. In this section we will concentrate on 
distribution ray tracing as an example of how such algorithms work. We’ll begin 
with an algorithm that is inefficient but straightforward, and then add a small twist 
that will improve the efficiency dramatically. 

Before the ray is sent into the environment, we can attach descriptive information 
to it. For example, we can select a frequency for the light that the ray
is destined to carry; this allows us to sample the visible light spectrum anywhere we
want, and thereby carry out color anti-aliasing.

We can also choose a time for the ray. Suppose that the lens is covered by a 
shutter which opens momentarily, like the shutter on a camera. Then each part of 
the lens is only exposed for an interval of time, and each part of the image plane 
receives illumination over the percent of the interval when it is exposed. To integrate 
over that interval, we use Monte Carlo methods to attach a time t to the ray. 
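
One common way to attach times to rays is stratified (jittered) sampling of the shutter interval. The sketch below is one possible choice rather than a prescribed method; the function name and parameters are hypothetical.

```python
import random


def jittered_times(t_open, t_close, n, rng):
    """One random time in each of n equal strata of the shutter interval."""
    width = (t_close - t_open) / n
    return [t_open + (k + rng.random()) * width for k in range(n)]


rng = random.Random(0)
for t in jittered_times(0.0, 1.0 / 60.0, 4, rng):
    print(t)
```

Each ray then evaluates moving objects at its own time t, and averaging the rays approximates the integral over the exposure interval.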



FIGURE 19.31


(a) Viewing the environment through a thin lens. (b) Sampling the view.


The selection of 2D image points, 2D lens locations, frequencies, and times may 
all be influenced by stratification and importance sampling. If we decide to stratify 
each dimension into n pieces, then a complete sampling would require n^6 rays for
these six dimensions. Happily, such complete sampling is not required, as we will 
see below. 

Once it is constructed, the ray enters the environment. It is common to speak
of finding the first intersection of the ray with the environment. This is because
by construction the ray has its origin in the camera model and is pointed into the 
environment. In fact, this is just a computational device, since light will travel along 
the ray from the environment into the camera. 

Because of this apparent backward direction of the ray, this type of ray tracing 
is sometimes called backward ray tracing. I do not recommend the use of this
term, because historically the same name has been used to describe algorithms that 
trace rays in the opposite direction, from the light sources into the camera model. 
The terms backward and forward have become sufficiently confused that it would 
be difficult to recommend a single usage here that would be consistent with the 
literature. I therefore suggest abandoning those adjectives, and instead use visibility 
tracing for this operation. Almost any terms suggested for these two senses of ray 
tracing can probably be argued as ambiguous under some interpretation, so I will 
simply use this name for this interpretation consistently in this book. 

The most important task associated with this ray is finding the first object it 
strikes. This involves using a library of ray-object intersection routines, which
provide the intersection point for a ray with each kind of object that may be in the 
scene. Many such routines have been developed for primitives ranging from spheres 
and planes to surfaces of revolution, fractals, and complex aggregate shapes. Such 
routines range from the simple to the very complex, and we will not review them 
here. An introduction to ray-object intersection algorithms may be found in Haines 
[177], and a thorough survey may be found in Hanrahan [186]. 

As a simple example of a ray-object intersection, consider the intersection of a
ray and a sphere, as shown in Figure 19.32. The ray sweeps out points R along a
parametric line defined by P_0 + P_1 s, where 0 < s ∈ ℝ. Suppose we have a sphere
with center C and radius r; all points P on the surface of the sphere satisfy the
equation (P − C) · (P − C) = r^2. This is a particularly nice pair of equations,
because the ray equation is explicit in the parameter s and the sphere equation is
implicit for the point P.

Where the ray and the sphere intersect, both equations are satisfied, which means 
that there is a value of s that can be plugged into the ray equation that generates a 
point which satisfies the sphere equation. So we can plug the ray equation into the 
sphere equation: we find 


$$
\begin{aligned}
0 &= (P - C) \cdot (P - C) - r^2 \\
  &= (P \cdot P) - 2(P \cdot C) + (C \cdot C) - r^2 \\
  &= (P_0 + P_1 s) \cdot (P_0 + P_1 s) - 2((P_0 + P_1 s) \cdot C) + (C \cdot C) - r^2 \\
  &= s^2 (P_1 \cdot P_1) + 2s\,(P_1 \cdot (P_0 - C)) + (P_0 - C)^2 - r^2 \qquad (19.27)
\end{aligned}
$$

This last equation forms a quadratic equation for $s$ with the well-known solutions
given by the quadratic formula:

$$
s_1 = \frac{-b + d}{2a}, \qquad s_2 = \frac{-b - d}{2a} \qquad (19.28)
$$

where

$$
\begin{aligned}
a &= (P_1 \cdot P_1) \\
b &= 2\,(P_1 \cdot (P_0 - C)) \\
c &= (P_0 - C)^2 - r^2 \\
d &= \sqrt{b^2 - 4ac}
\end{aligned}
$$

FIGURE 19.32

A ray-sphere intersection.


If the discriminant $b^2 - 4ac$ is less than zero, then $d$ and both solutions are imaginary;
in geometric terms the ray does not hit the sphere, as shown in Figure 19.33(a). If
$d = 0$, then both roots are the same; the ray is tangent to the sphere, as shown in
Figure 19.33(b). Finally, if the discriminant is positive, so that $d > 0$, then there are
two real roots, and the ray passes through the sphere, as shown in Figure 19.33(c).

When $d > 0$, we want to select the value of $s$ that is the smallest positive value;
this will then give us the point of intersection $P = P_0 + P_1 s$. Note that if the ray
starts within the sphere, one value of $s$ will be positive and the other will be negative.
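The algebra above carries over to code almost line for line. The following Python sketch is one minimal way to write it, not an optimized routine. The function name `ray_sphere` and the use of plain tuples for points are assumptions made for illustration. The coefficient `b` is written as $2\,P_1 \cdot (P_0 - C)$, which is the cross term obtained by expanding the substitution of the ray equation into the sphere equation, and the direction $P_1$ is assumed to be nonzero.

```python
import math

def ray_sphere(p0, p1, center, radius):
    """Return the smallest positive s with |p0 + p1*s - center| = radius, or None."""
    oc = [p0[k] - center[k] for k in range(3)]           # P0 - C
    a = sum(p1[k] * p1[k] for k in range(3))             # P1 . P1
    b = 2.0 * sum(p1[k] * oc[k] for k in range(3))       # 2 P1 . (P0 - C)
    c = sum(oc[k] * oc[k] for k in range(3)) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None                                      # imaginary roots: the ray misses
    d = math.sqrt(disc)
    # Try the nearer root first; a negative root lies behind the ray origin.
    for s in ((-b - d) / (2.0 * a), (-b + d) / (2.0 * a)):
        if s > 0.0:
            return s
    return None

if __name__ == "__main__":
    # A ray along +z from the origin toward a unit sphere centered at z = 5.
    print(ray_sphere((0, 0, 0), (0, 0, 1), (0, 0, 5), 1.0))
```

When the ray origin is inside the sphere the first root is negative and the loop falls through to the second, which matches the observation that one value of $s$ is positive and the other negative.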

We have presented the simplest algebraic solution to this problem without any 
concern for efficiency. This method is compared with a more thoughtful approach 
by Haines [177], who shows that this intersection computation may be significantly 
optimized by careful analysis.

FIGURE 19.33

(a) d < 0: The ray misses the sphere. (b) d = 0: The ray is tangent to the sphere. (c) d > 0: The
ray passes through the sphere.






The ray-object intersection routines themselves are notoriously expensive and 
consume a lot of computer time; a famous statistic due to Whitted is that ray-object 
intersection calculations occupied 95% of the compute time for images using his 
original algorithm [477]. As more sophisticated objects are included in scenes and 
the intersection algorithms grow more complex, the ray-object intersection cost can 
be expected to grow even larger. 

To reduce this expense, a plethora of acceleration methods have been designed to 
speed the process, primarily by eliminating from consideration those objects which 
certainly cannot be the first intersection object. This exclusion is usually based on 
the relative geometry of the object and the ray. The methods usually build a data 
structure which is at least partly in the physical space of the scene, and perhaps also 
in the phase space of the rays. There are many such acceleration algorithms; a survey 
is presented by Arvo and Kirk [17]. 

In general, there are three major approaches to such geometrically based
acceleration methods: bounding volume hierarchies, space subdivision, and directional
subdivision. Figure 19.34 shows a set of books on a shelf. If we wanted to find the
intersection of a ray with these books, the brute-force method would be to compute
the intersection of the ray with each book, and then choose the intersection nearest
the ray origin. This method would work, but since each intersection test is expensive,
the overall cost could be quite high. Such costs grow surprisingly quickly; a real
book has a complicated structure containing many pages, and the front and back
covers are generally not perfectly flat polygons but are curved in some cases. Intersecting
all this geometry can be expensive. And if the bookshelf is replaced by a more
complex database, such as the tens of thousands of books on library shelves, then each
ray will be prohibitively expensive to trace.

One way to speed up the intersection test is to place a bounding volume around 
the books. For example, suppose we place a single large box around all the books. 
The test for intersecting a ray with a box is relatively cheap compared to intersecting 
the ray with a book. So when the ray approaches the books, we test it against the 
box; if the ray misses the box, it certainly misses everything inside, and we need not 
test any books at all. We have successfully culled this entire set of books from the 
candidate list of objects that might represent the first intersections with this ray. 
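To make the culling step concrete, here is a small Python sketch of a ray-box test for an axis-aligned box, using the common "slab" formulation. The text does not prescribe a particular box test, so the method, the function name `ray_hits_box`, and the tuple representation are all assumptions for illustration.

```python
def ray_hits_box(origin, direction, box_min, box_max):
    """Slab test against an axis-aligned box.

    Returns (True, s_enter) if the ray origin + s*direction, s >= 0, touches
    the box, and (False, None) otherwise.
    """
    s_near, s_far = 0.0, float("inf")
    for o, d, lo, hi in zip(origin, direction, box_min, box_max):
        if d == 0.0:
            # The ray is parallel to this pair of planes: it must start between them.
            if o < lo or o > hi:
                return False, None
            continue
        s0, s1 = (lo - o) / d, (hi - o) / d
        if s0 > s1:
            s0, s1 = s1, s0
        s_near = max(s_near, s0)      # latest entry over all slabs
        s_far = min(s_far, s1)        # earliest exit over all slabs
        if s_near > s_far:
            return False, None        # the slab intervals do not overlap
    return True, s_near

if __name__ == "__main__":
    shelf_box = ((-1.0, -1.0, 2.0), (1.0, 1.0, 3.0))
    hit, s_enter = ray_hits_box((0, 0, 0), (0, 0, 1), *shelf_box)
    if not hit:
        print("ray misses the box; no book needs to be tested")
    else:
        print("ray enters the box at s =", s_enter)
```

The test costs a handful of divisions and comparisons per axis, which is why it is worth paying once to avoid many expensive per-book intersections.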

FIGURE 19.34

A set of books on a shelf.

In fact, we can build a hierarchy of these bounding volumes, nesting one inside
the other, a strategy originally suggested for ray tracing by Rubin and Whitted
[363]. Figure 19.35 shows a couple of levels in this subdivision. When the ray first
reaches the box, we can test it against the root of the hierarchy, representing the
large enclosing box. If the ray misses the root, then we look no further inside this
particular hierarchy. But if the ray does intersect the box, then we must look inside.
Rather than plunge immediately into intersecting all the books, however, we can test
the ray against the two sub-boxes shown in Figure 19.35(a). The ray will strike one
of these boxes before the other, so we can look inside the nearer box first. If we don’t
intersect any of the books in this nearer box, then we can look inside the other. This
process may be applied recursively; Figure 19.35(b) shows another step.
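The recursive descent just described can be sketched in a few dozen lines of Python. Everything named here (`Node`, `box_entry`, `sphere_hit`, `nearest_hit`) is hypothetical, spheres stand in for the books, and the hierarchy is assumed to have been built already. One deliberate difference from the prose: because sibling boxes may overlap, the sketch does not stop after the first child that yields a hit, but skips a child only when the ray enters its box farther away than the best hit found so far.

```python
import math
from dataclasses import dataclass, field

@dataclass
class Node:
    box_min: tuple
    box_max: tuple
    children: list = field(default_factory=list)   # inner node: child Nodes
    spheres: list = field(default_factory=list)    # leaf: (center, radius) pairs

def box_entry(o, d, lo, hi):
    """Slab test; return the entry distance along the ray, or None on a miss."""
    near, far = 0.0, math.inf
    for k in range(3):
        if d[k] == 0.0:
            if not lo[k] <= o[k] <= hi[k]:
                return None
            continue
        s0, s1 = sorted(((lo[k] - o[k]) / d[k], (hi[k] - o[k]) / d[k]))
        near, far = max(near, s0), min(far, s1)
        if near > far:
            return None
    return near

def sphere_hit(o, d, center, radius):
    """Smallest positive ray parameter at which the ray meets the sphere, or None."""
    oc = [o[k] - center[k] for k in range(3)]
    a = sum(x * x for x in d)
    b = 2.0 * sum(d[k] * oc[k] for k in range(3))
    c = sum(x * x for x in oc) - radius * radius
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None
    root = math.sqrt(disc)
    for s in ((-b - root) / (2.0 * a), (-b + root) / (2.0 * a)):
        if s > 0.0:
            return s
    return None

def nearest_hit(node, o, d, best=math.inf):
    """Return the nearest hit distance in this subtree, or `best` if nothing is nearer."""
    entry = box_entry(o, d, node.box_min, node.box_max)
    if entry is None or entry >= best:
        return best                                 # cull the whole subtree
    for center, radius in node.spheres:
        s = sphere_hit(o, d, center, radius)
        if s is not None and s < best:
            best = s
    entered = []
    for child in node.children:
        e = box_entry(o, d, child.box_min, child.box_max)
        if e is not None:
            entered.append((e, child))
    for _, child in sorted(entered, key=lambda pair: pair[0]):   # nearer box first
        best = nearest_hit(child, o, d, best)
    return best
```

A return value of `math.inf` means the ray hit nothing in the hierarchy. The child boxes are tested twice (once for ordering, once on entry), which keeps the sketch short at a small cost.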

In general, such methods build the hierarchy from the bottom up, first organizing
groups of objects into small clusters, and then clustering the clusters to build a tree.
The classical method for this construction is due to Goldsmith and Salmon [163]; a
particularly efficient set of bounding volumes is discussed by Kay and Kajiya [243].
A common characteristic of bounding-volume hierarchies is that they tend to keep
objects within a single bounding volume.

FIGURE 19.35

(a) The second level in the bounding volume hierarchy. (b) The third level.

FIGURE 19.36

A space-based subdivision.

Another approach to accelerating the first-intersection test is to subdivide the
space in which the model is embedded. Figure 19.36 shows the books inside a
regular 3D grid of cells. When we build this grid, we attach to each cell a list of all
the objects that are inside it. Note that the objects need not be cut up by this process;
a single object may reside in multiple cells simultaneously. When a ray strikes the
edge of this grid, we determine which cell it enters first, and look for intersections
with objects in that cell. If no intersections are found, the ray is propagated to the
next cell and the objects there are tested. If the ray does intersect one or more objects
inside the cell, the nearest such intersection is easy to find. Because objects can reside
in multiple cells, we need to make sure that a ray-object intersection really occurs
within the given cell.
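Propagating a ray from cell to cell in a regular grid can be done incrementally. The Python sketch below is one common floating-point way to enumerate the cells in the order the ray visits them; it is an illustrative assumption rather than the method of any of the papers cited here, and the name `grid_cells` is hypothetical. It assumes the grid starts at the coordinate origin, the cells are cubes of side `cell`, and the ray origin already lies inside the grid.

```python
import math

def grid_cells(origin, direction, grid_size, cell=1.0):
    """Yield the (ix, iy, iz) cells a ray visits, in order, until it leaves the grid."""
    idx = [int(math.floor(origin[k] / cell)) for k in range(3)]
    step = [0, 0, 0]
    t_max = [math.inf] * 3      # ray parameter at the next cell boundary on each axis
    t_delta = [math.inf] * 3    # parameter distance between boundaries on each axis
    for k in range(3):
        if direction[k] > 0:
            step[k] = 1
            t_max[k] = ((idx[k] + 1) * cell - origin[k]) / direction[k]
            t_delta[k] = cell / direction[k]
        elif direction[k] < 0:
            step[k] = -1
            t_max[k] = (idx[k] * cell - origin[k]) / direction[k]
            t_delta[k] = -cell / direction[k]
    while all(0 <= idx[k] < grid_size[k] for k in range(3)):
        yield tuple(idx)
        k = t_max.index(min(t_max))     # axis whose boundary is crossed next
        if t_max[k] == math.inf:
            return                      # zero direction vector: the ray never moves
        idx[k] += step[k]
        t_max[k] += t_delta[k]

if __name__ == "__main__":
    for ijk in grid_cells((0.5, 0.25, 0.5), (1.0, 1.0, 0.0), (2, 2, 1)):
        print(ijk)
```

In a tracer, each yielded cell's object list would be tested in turn, accepting a hit only if its ray parameter falls before the boundary crossing that ends the current cell; that is the check described above for objects that span several cells.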

This method was introduced for ray tracing by Glassner [153], Kaplan [241],
and Fujimoto et al. [150]. Both Glassner and Kaplan subdivided space using an
octree, while Fujimoto et al. used a regular grid. A feature of the octree is that it is
able to adapt to local variation in object density; where there are many objects in
a region of space, there can be many octree cells, so each cell contains only a small
number of objects. In large empty areas we can pass through large quantities of space
with a single step through a large cell. Unfortunately, this nonuniformity requires
some processing in order to advance the ray from one cell to the next, because two
consecutive cells visited by the ray may have different sizes. On the other hand, it
is easy to advance a ray from one cell to the next in a regular grid; in fact, it can be
done with integer arithmetic. But each cell now contains however many objects fall
within it, and to traverse empty regions, we must take many steps through empty
cells. The choice of whether to use adaptive or uniform subdivision must be made
based on the scene to be rendered and the characteristics of the implementation and
the computer; a comparison of these methods is given by MacDonald [278].

FIGURE 19.37

A direction-based subdivision.

Finally, we can subdivide based not only on the spatial characteristics of the
database, but also on the directional distribution of the rays that sample it. This
idea was originally used by Arvo and Kirk [16] in an algorithm that combined space
subdivision with directional subdivision. Figure 19.37 shows a simple subdivision
of the environment into small volumes that correspond to different directions that
may be followed by a ray from a common origin. Arvo and Kirk developed a
multidimensional subdivision method that built the subdivision cells dynamically as
the scene was being rendered.

This brief discussion has only hinted at the wealth of algorithms developed for 
accelerating ray-object intersections; the interested reader is encouraged to consult 
the references for much more detail. 

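Once the first ray-object intersection has been found, we need to determine the
light leaving the intersection point $\mathbf{s}$ and returning to the eye point $E$. In general,
we assume that the shading point is being queried by a ray that carries light away
from the point in an outgoing direction $\vec{\omega}^o$. Note that using our convention, this
outgoing direction is opposite to the direction of the ray which struck $\mathbf{s}$; that is,
with $P$ the position of $\mathbf{s}$,

$$
\vec{\omega}^o = -(P - E).
$$

To compute the shading at this point $\mathbf{s}$, we use OVTIGRE from Equation 17.16:

$$
\begin{aligned}
L(\mathbf{s}, \vec{\omega}^o) &= L^e(\mathbf{s}, \vec{\omega}^o) + L^p(\mathbf{s}, \vec{\omega}^o) \\
 &= L^e(\mathbf{s}, \vec{\omega}^o) + \int_{\Theta_i} f(\mathbf{s}, \vec{\omega} \to \vec{\omega}^o, \lambda)\, L(\mathbf{s}, \vec{\omega}) \cos\theta \, d\vec{\omega} \qquad (19.29)
\end{aligned}
$$

The emission term $L^e(\mathbf{s}, \vec{\omega}^o)$ we can find directly from the surface definition at $\mathbf{s}$.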

Distribution ray tracing uses a particular form of the stratification technique
discussed earlier to compute the propagated term $L^p(\mathbf{s}, \vec{\omega}^o)$. We subdivide it into
two separate integrals, one over the set of directions representing the luminaires (that
is, direct light), and the other over the set of all other directions (that is, the indirect
light). Since they combine to make $\Theta_i$, the direct set $\Gamma_d$ and the indirect set $\Gamma_i$ must
together form the set of all incident directions: $\Gamma_d \cup \Gamma_i = \Theta_i$. So the propagated
term may be written

$$
L^p(\mathbf{s}, \vec{\omega}^o) = \int_{\Gamma_d} f(\mathbf{s}, \vec{\omega} \to \vec{\omega}^o, \lambda)\, L(\mathbf{s}, \vec{\omega}) \cos\theta \, d\vec{\omega}
 + \int_{\Gamma_i} f(\mathbf{s}, \vec{\omega} \to \vec{\omega}^o, \lambda)\, L(\mathbf{s}, \vec{\omega}) \cos\theta \, d\vec{\omega} \qquad (19.30)
$$

The direct set $\Gamma_d$ is found by identifying the luminaires and stratifying them
into the sets $G_d$. These surface strata are then converted to direction strata using
$\Gamma_d = \Pi(G_d)$, as in Figure 19.14. Then from above, the complement of the direct
light with respect to the incident sphere is the indirect light: $\Gamma_i = \Theta_i - \Gamma_d$. We
can use any integration method to estimate these integrals. Using the ray tracing
approach we can find the direct contribution by sending rays from $\mathbf{s}$ to each of the
strata on the luminaires. Those that are blocked by other objects are added to the
indirect component. Notice that this knowledge of a blocking object can be used to
help us refine the visible stratum on the luminaire.

To estimate the indirect contribution, we can send out a variety of rays in different 
directions, using a combination of explicit strata and importance sampling. This is 
illustrated in Figure 19.38. This sampling may be generated and adaptively refined 
using any of the uniform or nonuniform methods in Chapter 10; each of those 
methods yields an algorithm with different performance features.

FIGURE 19.38

Sampling the indirect contribution.


The recursive generation of samples continues until it is explicitly stopped. There 
are many reasons to stop the recursion: the ray may strike a fully absorbing surface, 
it may escape from the environment (by hitting the enclosure sphere), or it may be 
terminated. Termination criteria for random walks were discussed in Chapter 16; 
the important thing to note is that as a sample moves deeper into the environment, 
it will make less of a contribution with each step, and eventually the contribution to 
the radiance at L will be so small that the sample can be considered negligible. To 
avoid introducing bias, we can’t simply stop at some cutoff, so a technique such as 
Russian roulette should be used to determine when to terminate a ray. 
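The difference between a hard cutoff and Russian roulette is easy to see on a toy problem. In the Python sketch below, a walk collects an emitted value at every bounce and is attenuated by a reflectivity `rho` each time, so the exact answer is the geometric series `emitted / (1 - rho)`. The setup, the function names, and the fixed survival probability are all assumptions chosen for illustration; they are not the termination criteria of Chapter 16.

```python
import random

def truncated_walk(rho, emitted, depth):
    """Deterministic cutoff after `depth` terms: always underestimates the series."""
    return sum(emitted * rho ** k for k in range(depth))

def roulette_walk(rho, emitted, survive, rng):
    """One random-walk estimate of emitted * (1 + rho + rho^2 + ...)."""
    total, weight = 0.0, 1.0
    while True:
        total += weight * emitted
        if rng.random() >= survive:
            return total             # the walk is killed by the roulette
        # Survivors carry extra weight 1/survive to make up for the killed walks.
        weight *= rho / survive

if __name__ == "__main__":
    rng = random.Random(1)
    n = 20000
    mean = sum(roulette_walk(0.5, 1.0, 0.7, rng) for _ in range(n)) / n
    print("exact value        :", 1.0 / (1.0 - 0.5))
    print("cutoff at depth 3  :", truncated_walk(0.5, 1.0, 3))
    print("roulette, averaged :", mean)
```

Every individual roulette walk is finite, yet the average over many walks converges to the full series, whereas the cutoff is off by the same amount no matter how many samples are taken.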

Once all the samples have been generated and their intersection points evaluated,
the BDF is applied to the irradiance, which is simply the radiance along each sample
times the cosine of the angle of incidence of the sample. The resulting propagated
light is then added to the emitted light, and that’s the radiance that is sent back away
from $\mathbf{s}$ in the direction $\vec{\omega}^o$, back along the ray that struck $\mathbf{s}$, though in the opposite
direction.

An example of this sequence in action is shown in Figure 19.2. The rays form a 
ray tree , with the lens point as the root and each intersection point represented as a 
node. The arcs in the tree represent the rays themselves. We build the tree from the 
top down, starting at the lens and determining the intersections with objects as we 
work our way into the environment. Then we pass shading information back up, 
starting at the leaves and combining propagated with reflected light until we make 
it back up to the lens, where the information can be stored as a screen sample. 

We need to keep in mind at all steps in this process that we’re using samples 
to represent a continuous signal; the problems of undersampling, and thus aliasing, 
crop up all along the way and must be addressed through appropriate choice of 
sampling density, prefiltering of the database, or nonuniform distributions of sample 
points to trade structured aliasing artifacts for structure-free noise. We want to avoid 
structured errors even in the illumination estimates (which we normally never see 
directly) because they are propagated in a nonuniform way by the BDF at the surface. 
If there is a particularly bad artifact right where the signal has great influence, the 
effect of that artifact can be greatly multiplied. The best bet is to keep the average 
error in any region low. So although we noted in Unit II that a noisy signal may 
have the same overall error as a signal containing structured aliases, it distributes 
that error more uniformly, and thus is more appropriate for this sort of application. 

The explosion of rays in this algorithm is considerable: we said that a complete
sampling of a six-dimensional parameter space of rays with a density of $n$ samples
required $n^6$ rays at the screen per pixel, and each of those rays may create many
new rays at each ray-object intersection. The whole process can be brought under
control by using incomplete block sampling. Consider just the two variables $(x, y)$
that describe an image location. If we subdivide each axis into four pieces, this implies
that we need sixteen samples to sample the domain, as shown in Figure 19.39(a).

Suppose that rather than require a completely filled block, we only require that 
the marginal distribution of the block on each axis be filled; that is, there must be one 
sample in each of the x strata and each of the y strata. We need only four samples 
to do this job; Figure 19.39(b) gives an example. The pattern in Figure 19.39(b) is
highly correlated, which can produce errors (recall our discussion of Figure 10.44).
As we saw in Chapter 10, there are a variety of ways to distribute samples in this 
grid that avoid producing correlated patterns. This same idea can be extended to 
any number of dimensions, so that for an n-dimensional space, where each axis has 
been subdivided into s pieces, we need only s well-chosen samples. 
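One simple way to generate such a pattern is to give every axis its own random permutation of the strata. The Python sketch below does this; the function name `incomplete_block_samples` is hypothetical, and independent shuffling is just one of the decorrelating strategies one could choose, not necessarily one of those presented in Chapter 10.

```python
import random

def incomplete_block_samples(strata, dims, rng):
    """Return `strata` points in the unit cube of dimension `dims`.

    Projected onto any single axis, the points occupy each of the `strata`
    equal-width intervals exactly once.
    """
    columns = []
    for _ in range(dims):
        order = list(range(strata))
        rng.shuffle(order)           # independent shuffle per axis breaks the diagonal
        columns.append([(cell + rng.random()) / strata for cell in order])
    return [tuple(column[i] for column in columns) for i in range(strata)]

if __name__ == "__main__":
    rng = random.Random(7)
    # Four strata per axis in six dimensions: 4 samples instead of 4**6 = 4096.
    for point in incomplete_block_samples(4, 6, rng):
        print(" ".join(f"{x:.3f}" for x in point))
```

Without the shuffle, every point would fall in the same stratum index on every axis, which reproduces the correlated diagonal pattern of Figure 19.39(b).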

A summary overview of the entire ray-tracing process is shown in Figure 19.40. 

Figure 19.41 (color plate) shows an example of an image produced by distributed 
ray tracing. Note the soft shadows, produced by numerically integrating over the 
solid angles occupied by finite light sources, and the motion blur, produced by 
numerical integration over the time duration of the exposure. 



FIGURE 19.39

(a) Complete block sampling. (b) Structured incomplete block sampling. (c) Unstructured
incomplete block sampling.


Gathering indirect illumination by distribution ray tracing is very expensive. To 
cut down on the cost, Ward et al. save this information each time it is computed 
[470]. When a ray samples a surface, they first look around to see if there are 
one or more nearby, already computed indirect illumination signals. If so, they 
are interpolated to produce a signal at the shading point. The assumption is that 
indirect illumination arrives mostly from diffusely reflecting surfaces, and that the 
light received from such surfaces changes little as we move about on a receiving 
surface. Figure 20.4 shows an image generated with this approach. 










FIGURE 19.40

An overview of distributed ray tracing.


19.3.4 Discussion 

The advantages of beam and cone tracing are that they are able to dynamically create
and evaluate entire surface strata at once. This can be very efficient, particularly with
the use of constant-time filtering methods such as mip-maps [480] and sum tables
[111].

On the other hand, the geometry of reflection and refraction is difficult to model
accurately with these methods, and in complex databases the shapes of the strata
may become very complex; this can lead to highly fragmented beams and large
numbers of cones in order to match oddly shaped solid angles. More discussion of
the geometry of beam tracing may be found in Dadoun et al. [114].

The point-sampling methods of classical ray tracing, distribution ray tracing,
and path tracing all stratify the environment on the fly, differently for every sample
taken. This is sufficiently expensive that the stratification is usually quite coarse: the
brightest luminaires generate surface strata, and a few reflection and transmission
strata are generated. Resolution is performed on the fly when an indirect sample
strikes a luminaire. Path tracing is attractive because it does not produce bushy
trees. Note that distribution ray tracing (and classical ray tracing) create ray trees
that tend to get thicker as they grow deeper, because many rays are generated at each
intersection. Kajiya has observed that the rays at the bottom of the tree are the ones
that contribute the least to the final image [234], so we’re spending the most time
and work where it has the least impact on the result. Path tracing places as
many rays at the root as it does deeper in the tree, but because the stratification is
so sparse (a single point) for each intersection, path tracing typically requires more
rays overall than distribution ray tracing for an image of the same error with respect
to an ideal reference.

Because the ray-tracing methods discussed here do not explicitly construct strata, 
they must do so implicitly in order to find the radiance returned by the stratum 
along the ray that samples it. One common approach is to simply propagate the 
degenerate strata approach throughout the environment: each sample is a point 
and all other points may be ignored. We know from signal processing that this 
method of point sampling can lead to undersampling, and hence aliasing. To reduce 
structured aliasing, the points can be generated in a nonuniform pattern, but we can 
still miss large structures. It would be convenient to combine the explicit surface
stratification of the solid-angle approaches with the dynamic sampling of the
ray-tracing approaches.

Such a combination has been suggested by Glassner [158]. In this approach,
any ray-tracing method is used to sample the incident light until the signal is
considered acceptable. During this process, the complete ray-object intersection tree
of each ray is recorded. When sampling is complete, the illumination information
computed in the ray-tracing pass is discarded, and the trees are retraversed (since all
the intersections have been stored, this traversal requires no new intersections). At
each node, the complete distribution of samples on all objects intersected from that
node (including luminaires) is used to induce a stratification on the environment, as
shown in Figure 19.42. Notice that the rays that extend into the environment past
the first-intersected object help to refine the visible strata on objects farther away,
including those on the backs of objects. The radiance sent from each surface stratum
to the shading point is then estimated, and this is used as the incident radiance at
that point. Finding these propagated radiances may require descending the tree and
inducing new, unique sampling patterns on the environment from the subnodes.

FIGURE 19.42

Dynamic stratification.


19.4 Photon Tracing

Photon tracing involves generating photons at the light sources in a scene and
distributing them into the environment. Each photon has an associated frequency $\nu$
(and thus an energy given by $E = h\nu$). If we really traced individual photons in
an environment we would never get a picture made in practical time; each photon
simply carries far too little energy. Furthermore, in a complex scene many photons
will be absorbed before striking a surface that will make a contribution to an image
from a given point of view.

In 1968 Appel published an algorithm where random photons were followed
from the light source, and the first point intersected by a photon was projected to
the virtual screen. Rather than store the image in computer memory, Appel directly
drew results as they were computed on a plotter [11]. If the point wasn’t blocked by
any other object between itself and the screen, a dot was placed by the plotter at the
appropriate location. After enough dots had been projected, a photonegative of the
image would have white areas in regions of high illumination and black areas where
illumination levels were low. Figure 19.43 shows an example of this procedure on
a machine part, where the rays were generated in a regular pattern, and a plus sign
was used instead of a dot. Notice the complex shadows created by light passing
through the hole. This method doesn’t generalize well for complex scenes, and it
fails to take any indirect illumination into account.

FIGURE 19.43

A shaded drawing of a machine part produced by photon tracing. Reprinted, by permission, from
Appel in AFIPS 1968 Spring Joint Computer Conference, fig. 14, p. 44.

A common optimization for this approach is to assume that not just one, but 
millions of photons or more are produced by the light source in each direction per 
unit of time. Then, as those photons enter into the environment, we can speak of 
what happens to the aggregate collection, rather than individual particles. 

For example, suppose that 100 photons leave a source in the direction of a
receiving patch, and they all arrive. If the patch is a purely diffuse reflector with
reflectivity $\rho = 0.6$, then on average six of ten photons will be reflected, and four of
ten will be absorbed. If we traced the photons individually, then each time a photon
struck the patch, we would either absorb or reflect that photon, with a 40% chance
of absorption. The absorbed photons do nothing for us except make the patch a bit
warmer (which can influence its thermal emission in the visible band if the heat gets
high enough). Except for this possible influence on the heat radiated by the patch,
the computationally expensive process of following this absorbed photon has been
wasted. It’s better to follow a packet of photons, absorb 40% of them, and then
follow the path of the remaining 60%.
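The two bookkeeping strategies can be compared directly on this example. In the following Python sketch (the function names are hypothetical and the patch is reduced to a single reflectivity), `trace_individual` makes an absorb-or-reflect decision for every photon, while `trace_packet` simply scales the packet.

```python
import random

def trace_individual(n_photons, rho, rng):
    """Follow photons one at a time; each is reflected with probability rho."""
    return sum(1 for _ in range(n_photons) if rng.random() < rho)

def trace_packet(n_photons, rho):
    """Treat the photons as one packet; return (reflected, absorbed) amounts."""
    reflected = n_photons * rho
    return reflected, n_photons - reflected

if __name__ == "__main__":
    rng = random.Random(3)
    print("individual photons reflected:", trace_individual(100, 0.6, rng))
    print("packet (reflected, absorbed):", trace_packet(100, 0.6))
```

The individual count fluctuates around 60 from run to run and costs one decision per photon, about 40 of which are spent on photons that are then discarded. The packet reaches the expected split in a single step.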

The direct simulation of photons streaming from the light source into the
environment has been studied in detail by Pattanaik [332] and Pattanaik and Mudur
[333,334]. They generate photons at the sources using importance sampling, in 
order to make sure that the distribution of photons into the environment matches 
the energy distribution of the luminaire. Each time a photon (or photon packet) 
strikes a surface, the location of the intersection must be stored, and the amount of 
energy reflected (and transmitted) at that point must be recorded on an illumination 
map [13]. An example of the simulation is shown in Figure 19.44 (color plate). 

In general, every photon-surface intersection will be at a different point, so we 
have a seemingly impossible storage task. Pattanaik and Mudur instead discretize 
(or mesh) the environment prior to rendering, just as in radiosity [333]. All of the 
intersections and reflections within a patch are lumped together, and the re-emission 
of energy from the patch is determined by this aggregate result. Rather than save 
samples on a surface, Chattopadhyay and Fujimoto store the values in the nodes of 
a 3D grid in which the scene is immersed [80]. 

Deciding how many photons to shoot, where to shoot them from, and where to 
shoot them to are difficult issues. For example, consider a patch that reflects some 
of its incident light via diffuse reflection; in which directions should this light be 
propagated into the environment? Pattanaik uses importance (or potential) to help 
answer this question; thorough details are presented in [332]. This allows a very 
natural progressive refinement interpretation of the scene: at any moment during the 
simulation, we have accounted for some percentage of the photons that are traveling 
in the environment. We can render an image by simply looking up the number 
of photons which are radiated from each surface at this moment; waiting a bit 
longer will allow more photons to distribute, and therefore produce a more accurate 
simulation. The use of importance helps drive the simulation toward distributing
photons where they will make the most impact on an image.


19.5 Bidirectional Ray-Tracing Methods 

Visibility tracing and photon tracing may be combined into a multipass ray-tracing 
algorithm. 

The inspiration for this combination comes from the observation that visibility
tracing is very poor at finding a good estimate for the indirect illumination on a
point. Recall that we simply lumped together all the indirect illumination into some
solid angle $\Gamma_i$ (the indirect set), and said that some integration method would be
needed to evaluate the illumination in this solid angle. Because potentially every
object in the scene is visible to the point, lending it indirect illumination, it would
take a great many rays to evaluate this indirect signal accurately. Typically some
effort is made, but the error threshold is set very high, so that only a few samples
are taken for this term.

FIGURE 19.45

A mirrored ball in a room.

A challenging case for visibility ray tracing is a mirrored ball hanging in a room, 
as shown in Figure 19.45. A bright, tightly directed red spotlight shines on the ball, 
and the tiny mirrored facets on the ball reflect that light in many different directions. 
Eventually each mirror creates a small patch of illumination on a wall of the room. 
Now imagine a person wearing white cotton clothing (that is, primarily diffuse) is 
standing near the wall, between two of these patches (but not blocking either one 
from the ball). Suppose that the walls are coated with a diffusely reflecting white 
paint. We would expect to see the bright red light from the wall partly illuminate the 
person’s white clothing, causing spread-out red regions. This is called color bleeding. 

Consider trying to render this scene using visibility ray tracing. From the eye (or 
lens), rays are fired into the scene; suppose one of them struck one of the patches of
clothing that we would expect to be red. How is visibility ray tracing going to find
that red light? Direct-illumination rays will probably be sent to the light source, but 
they will be blocked by the light’s enclosure. Unless we’re coming into the clothing 
at a very grazing angle, it will not reflect much light via specular reflection, so we’re 
left with estimating the indirect illumination. The problem is that we simply don’t 
know where significant indirect illumination might be arriving from, so we must 
simply sample randomly and hope to hit something useful. 

Of many visibility rays fired in the directions around this spot on the clothing, 
a few will probably hit the wall. Suppose we are lucky enough to hit one of the 
red spots. The problem now becomes one of finding the source of the bright red 
illumination on this patch of the wall; we have the same problem as before, since 
the wall is a diffuse reflector itself. Since the mirror that is causing this illumination 
occupies a very small solid angle from this point on the wall, it is unlikely that we 
will hit it by random sampling of the environment. The chance of getting a complete 
path from the clothing to the red spot on the wall to the mirror is not zero, but it is 
small. In practice, visibility tracing will in general fail to find this illumination. 

Note that in situations like this we might be able to fix the odds; if there are 
only a few specular surfaces in the room, then we can try each one as a possible 
source of illumination. In other words, we create strata on the specular surfaces and 
then sample those strata as direct sources rather than as part of the overall indirect 
illumination solid angle. Then we would hit one of the mirrors, and the specular 
reflection from there would take us to the light source. Following the chain back in 
the opposite direction, the light will finally make it to the white clothing. But if there 
are lots of specular surfaces, then this method becomes impractical. 

Instead of visibility tracing, we try photon ray tracing. A common use of photon 
ray tracing is just like classical ray tracing in reverse: we generate photons from the 
light source and follow them into the scene. If a photon strikes a specular surface, 
then we reflect it and continue following it. When the photon strikes a diffuse 
surface, we simply deposit its energy at that point on the surface and stop following 
that energy bundle. 

A convenient way to describe the chain of events experienced by a ray of light is to 
use a notation introduced by Heckbert [202] which builds a short string of symbols 
representing creation, absorption, and the various intervening states. Emission of a 
photon from a light source is written L, and absorption at the eye (or intersection 
with the image plane) is written E. Along the way from L to E the photon may 
interact with a volume V , it may be specularly reflected or transmitted 5, or it may 
be diffusely reflected or transmitted D . The sequence is written left to right over 
time, so when they appear, L is the first character and E is the last. We use stan¬ 
dard computer-science regular expression symbology [271] to indicate compound 
expressions: subexpressions may be grouped in parentheses, an asterisk superscript 
* represents 0 or more repetitions, and a plus-sign superscript + represents 1 or more 
repetitions. A term in square brackets is optional; it may be included or not. The
vertical bar | represents a selection among members; when a group is repeated, the
selection may be different on each repetition. For example, (S|V)* represents an
empty sequence, and the sequences S, V, SSV, VSSVSVV, and so on.
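Because the notation is so close to ordinary regular expressions, it can be checked mechanically. The sketch below uses Python's `re` module; the one wrinkle is that square brackets mean "optional" in the path notation but "character class" in Python, so the hypothetical helper `to_python_regex` rewrites `[X]` as `(?:X)?` before matching.

```python
import re

def to_python_regex(expression):
    """Convert the path notation to Python regex syntax.

    In the path notation [X] marks an optional term, whereas Python's `re`
    treats square brackets as a character class, so [X] becomes (?:X)?.
    """
    return re.sub(r"\[([^\]]*)\]", r"(?:\1)?", expression)

def generates(expression, path):
    """True if the whole string `path` can be generated by `expression`."""
    return re.fullmatch(to_python_regex(expression), path) is not None

for candidate in ["", "S", "V", "SSV", "VSSVSVV", "SD"]:
    print(repr(candidate), generates("(S|V)*", candidate))
```

Only the last candidate is rejected, since D is not one of the alternatives in the group.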

Classical ray tracing only models specular reflections and transmissions (both 
represented by the letter S) in vacuum, so it can be described as modeling L[D]S*E 
paths, illustrated in Figure 19.46. We call L[D]S*E the characteristic expression for 
the classical ray-tracing model. In words, there are four different types of strings that 
this expression can generate, and hence the same number of different light paths that 
can be captured by the classical ray-tracing model: LE, LDE, LS*E, and LDS*E.

The expression LE corresponds to light that is directly visible from the eye; this 
represents rays that look directly upon a light source, as shown in Figure 19.46(a). A 
path of the form LDE represents the light from a source directly striking a diffusely 
reflecting surface which is immediately visible, as shown in Figure 19.46(b). Strings 
of the form LSE , LSSE , LSSSE , and so on, represent light that has been captured at 
the eye after specular reflection off of a series of surfaces, as shown in Figure 19.46(c). 
Finally, a string such as LDSSE represents light that has been diffusely reflected and 
then specularly reflected twice before reaching the eye, as shown in Figure 19.46(d). 

The form of the characteristic expression L[D]S*E is directly related to the classical
ray-tracing algorithm. We know that all paths end at the eye, E. As we search
into the environment, we may strike a specularly reflecting surface. Direct illumination
arriving at that surface and reflected to the eye is represented LSE. Indirect light
is of the form L...SE, where the dots indicate some series of interactions. Suppose
that we strike another specular surface; then the direct light upon that surface is
specularly reflected twice before reaching the eye, represented by the path LSSE,
and indirect light follows a path L...SSE. Suppose that the next surface is diffuse.
In classical ray tracing we simply gather only direct illumination at this point and 
bring it back to the eye, creating the path LDSSE. All of these paths are captured 
in the characteristic expression for classical ray tracing. 

Distribution ray tracing can in theory capture all possible paths; that is, L(S|D)*E.
In practice, however, the capturing of diffuse information is sufficiently expensive
that it is rarely carried out explicitly. However, near-specular reflection and transmission
(gloss and translucency) are well modeled by this method, so we write its
characteristic expression as L[D]G*E, where the specular term S has been replaced
by the glossy term G.

Photon ray tracing, on the other hand, generates paths of the form LS*[D][E].
In words, we start at the light and progress into the environment. If we strike a 
specular surface, we propagate the light (by reflection or transmission) to the next 
surface. When we strike a diffusely reflecting surface we stop, since it is unclear 
where to best propagate the energy. Notice that these paths don’t necessarily end at 
the eye; that’s because a ray may be absorbed rather than propagated. 

Suppose that we use photon tracing to carry light from the sources to the environment,
and visibility tracing to gather radiance from the environment and bring it
back to the eye. Then the visibility paths need not start with L, since the distribution
of the energy on the diffuse surfaces has already been accounted for; in other words, 
when striking a diffuse surface, we have already computed the direct illumination. 
Then the photon tracing paths LS*[D][E] and the visibility paths [D]S*E “meet in 
the middle” [202] to create paths LS*[D]S*E. This isn’t quite the full range of 
possible expressions, but it’s more than either algorithm can produce alone. 
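To see how much the combination adds, the characteristic expressions can be compared by brute force. The sketch below enumerates every path with a bounded number of surface interactions and counts how many each expression captures; the function name `captured` and the bound of three interactions are arbitrary choices for illustration.

```python
import re
from itertools import product

# Square brackets in the path notation mean "optional"; in Python regex
# syntax that is written with a trailing "?".
CLASSICAL = re.compile(r"LD?S*E")         # L[D]S*E
BIDIRECTIONAL = re.compile(r"LS*D?S*E")   # LS*[D]S*E
ALL_PATHS = re.compile(r"L(S|D)*E")       # L(S|D)*E

def captured(pattern, max_interactions):
    """List every path with at most max_interactions S or D events that the
    pattern generates in full."""
    paths = []
    for n in range(max_interactions + 1):
        for middle in product("SD", repeat=n):
            path = "L" + "".join(middle) + "E"
            if pattern.fullmatch(path):
                paths.append(path)
    return paths

for name, pattern in [("classical", CLASSICAL),
                      ("bidirectional", BIDIRECTIONAL),
                      ("all", ALL_PATHS)]:
    print(name, len(captured(pattern, 3)))
```

With up to three interactions the classical expression captures 7 of the 15 possible paths and the combined expression captures 10; the ones still missing are exactly those with more than one diffuse interaction.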

This type of combination is called a bidirectional ray-tracing algorithm, since 
rays have been traced in both directions after both algorithms have been executed. 

Bidirectional ray-tracing algorithms were introduced by Arvo [13], and further 
developed by Chattopadhyay and Fujimoto [80], Heckbert [202,207], Pattanaik 
[332], Pattanaik and Mudur [333,334], and Ward et al. [470]. 

Figure 19.47 (color plate) shows an example result of Heckbert’s algorithm. The 
reflection on the bottom of the ball of the bright highlight on the ground is an 
example of an LSDSSE path, the richest type of path this algorithm can generate. In
this image the meshing on the ground plane has not been smoothed, so the discrete 
(and nonuniform) nature of the received illumination during the photon tracing pass 
is easy to see. 


19.6 Hybrid Algorithms

Bidirectional ray-tracing algorithms are members of a larger class of synthesis
algorithms called hybrid algorithms or multipass algorithms. The inspiration behind 
two-pass techniques is to observe that classical radiosity and classical ray tracing 
have complementary strengths and weaknesses. 

The power of classical ray tracing comes from the fact that the illumination signal 
is computed anew for each shading point. The direct light sources are stratified and 
sampled, and the indirect environment is sampled (or approximated from prior 
evaluations). This means that the method can capture sharp shadows; when a 
point no longer sees a source, it falls automatically into shadow, and if the source 
is sufficiently small, the shadow will be sharp. Specular reflections and refractions 
are also easily captured, since the proper illumination directions are evaluated when 
needed. All of these components of the illumination signal may be refined adaptively 
to any level of precision and confidence. 

The weak spot in classical ray tracing is that indirect illumination information is 
very expensive to evaluate accurately. This information comes from everywhere in 
the environment, and since we can only follow point samples, a complex environment 
requires many such samples. Significant sources of light can be missed if they aren’t 
luminaires themselves; the brightly focused light or caustic created by a lens on a 
surface can be difficult to find if the shading point doesn’t query the lens directly. 
We can specifically search out the specular surfaces, but illumination from multiple 
diffuse reflections is prohibitively costly.

On the other hand, classical radiosity algorithms excel at the evaluation of indirect
illumination, particularly that produced by multiple diffuse reflection. This gives rise
to soft shadows and color-bleeding, where the diffuse reflection of one colored patch
influences the color of another patch. Each source that radiates energy is considered 
a first-class light source to the program; each patch is evaluated by the energy it 
radiates, not its diffuse or specular characteristics. 

The weak spot in classical radiosity is the handling of high-frequency detail. This 
is mostly due to the meshing that is at the heart of the radiosity technique: the 
resolution of the mesh limits the granularity of the representation of the radiance 
signal. No incident illumination can be computed with resolution greater than that 
of the mesh, and no propagated light can be distributed with any more precision 
than the mesh provides. 

The strengths and weaknesses of ray tracing and radiosity are complementary, 
and it seems reasonable to expect that a single algorithm that employs both methods 
should be superior to either one individually. This is the philosophy behind hybrid 
algorithms. Typically such algorithms are implemented by a sequence of radiosity
and ray-tracing steps, and are therefore called multipass algorithms. When only two
passes are used, one has a two-pass algorithm.

The essence of all hybrid algorithms is that all the different types of light transport 
paths that will be handled are determined beforehand, and each type of path is 
handled only once. We must make sure when combining multiple rendering methods 
that no single type of light transfer is included more than once into the final radiance 
estimate. This can be tricky because some algorithms do need to follow the same 
paths multiple times; we must be sure to dispose of the extra copies. 

Most hybrid algorithms begin with a radiosity first pass to generate the result 
of multiple diffuse interreflection in the environment. Since radiosity solutions are 
view-independent (at least to within the assumptions discussed in Chapter 17), this 
solution may be stored with the model and used repeatedly for different views of the 
scene, as long as nothing changes except the viewpoint. This is then followed by a 
ray-tracing second pass, which adds in the view-dependent features due to specular
reflection. 

This process is nicely described by Sillion and Puech [409]. Recall the operator 
form for VTIGRE from Equation 17.15:

$$L = L^e + \mathcal{K} L \tag{19.31}$$

Let’s divide the light transport operator $\mathcal{K}$ into the sum of a specular term $\mathcal{K}_s$ and a
diffuse term $\mathcal{K}_d$; this is equivalent to breaking down the BDF into two terms. Then

$$L = L^e + (\mathcal{K}_d + \mathcal{K}_s) L \tag{19.32}$$

Now we will define the diffuse distribution of light $L^d$ implicitly by the relationship

$$L = L^d + \mathcal{K}_s L \tag{19.33}$$

Comparing this to Equation 19.31, we see that it relates the final radiance distribution
$L$ at each point to the sum of the diffusely radiated component at that point plus the
result of specular propagation of light throughout the environment. In other words,
the diffuse term $L^d$ is the emission term; the diffuse radiation is “painted” onto
the surfaces and they radiate it into space. If we propagate this diffusely reflected
light into the environment and let it bounce around specularly, the result is the final
radiance distribution $L$.

If we isolate $L$

$$(I - \mathcal{K}_s) L = L^d \tag{19.34}$$

and use the Neumann series approximation from Chapter 16, we find

$$
\begin{aligned}
L &= (I - \mathcal{K}_s)^{-1} L^d \\
  &= \sum_{n=0}^{\infty} (\mathcal{K}_s)^n L^d \\
  &= \mathcal{K}_s^{\infty} L^d
\end{aligned}
\tag{19.35}
$$

where we have implicitly defined the resolvent operator $\mathcal{K}_s^{\infty}$ (recall Equation 16.40
from our discussion of the Neumann series in Section 16.6.3).
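A small numerical check makes the Neumann series concrete. In the sketch below the specular operator is replaced by a hypothetical 2-by-2 matrix coupling two patches (the entries are invented, chosen only so that the series converges), and the truncated sum is compared against the defining relation of Equation 19.34.

```python
def mat_vec(A, x):
    """Apply a small dense operator (a list of rows) to a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def neumann_apply(K, x, terms):
    """Approximate (I - K)^{-1} x by the truncated sum of K^n x for n < terms."""
    total = [0.0] * len(x)
    term = list(x)
    for _ in range(terms):
        total = [t + u for t, u in zip(total, term)]
        term = mat_vec(K, term)   # next power of K applied to x
    return total

# Hypothetical specular transport between two patches, and a diffuse term.
K_s = [[0.0, 0.5],
       [0.3, 0.0]]
L_d = [1.0, 2.0]

L = neumann_apply(K_s, L_d, terms=60)

# Equation 19.34 says (I - K_s) L should reproduce L_d.
check = [l - k for l, k in zip(L, mat_vec(K_s, L))]
print(L, check)
```

Each extra term of the sum corresponds to one more specular bounce, which is why the series converges quickly when the operator absorbs a good fraction of the light at every bounce.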

Now if we can find the diffuse distribution $L^d$, then we can find the complete
radiance $L$. First, expand Equation 19.32,

$$L = L^e + \mathcal{K}_d L + \mathcal{K}_s L \tag{19.36}$$

regroup,

$$L - \mathcal{K}_s L = L^e + \mathcal{K}_d L \tag{19.37}$$

and apply the definition of $L^d$ to the left side:

$$L^d = L^e + \mathcal{K}_d L \tag{19.38}$$

Now plugging in Equation 19.35 for $L$,

$$L^d = L^e + \mathcal{K}_d \mathcal{K}_s^{\infty} L^d \tag{19.39}$$


Equation 19.39 is equivalent to Equation 19.31, except that it expresses the radiance 
in terms of the diffuse component which is propagated around the environment 
by specular transfers. 

Hybrid algorithms generally compute an approximation to $L^d$ using radiosity,
and then compute an approximation to $\mathcal{K}_d \mathcal{K}_s^{\infty} L^d$ using ray tracing.
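The equivalence of the two formulations can be checked numerically on a toy problem. The sketch below is not any published hybrid algorithm: the two operators are hypothetical 2-by-2 matrices with invented entries, the first step solves Equation 19.39 for the diffuse distribution by fixed-point iteration, and the second step recovers the radiance through Equation 19.35. The result is compared with a direct solution of Equation 19.31.

```python
def mat_vec(A, x):
    """Apply a small dense operator (a list of rows) to a vector."""
    return [sum(a * v for a, v in zip(row, x)) for row in A]

def vec_add(x, y):
    return [a + b for a, b in zip(x, y)]

def resolvent(K, x, terms=200):
    """Truncated Neumann series: the sum of K^n x for n < terms."""
    total, term = [0.0] * len(x), list(x)
    for _ in range(terms):
        total = vec_add(total, term)
        term = mat_vec(K, term)
    return total

# Hypothetical diffuse and specular transport operators and emission.
K_d = [[0.0, 0.3], [0.2, 0.0]]
K_s = [[0.0, 0.1], [0.4, 0.0]]
L_e = [1.0, 0.0]

# Reference: solve L = L_e + (K_d + K_s) L directly (Equation 19.31).
K = [[d + s for d, s in zip(row_d, row_s)] for row_d, row_s in zip(K_d, K_s)]
L_reference = resolvent(K, L_e)

# First step: iterate L_d = L_e + K_d K_s^inf L_d (Equation 19.39).
L_d = list(L_e)
for _ in range(200):
    L_d = vec_add(L_e, mat_vec(K_d, resolvent(K_s, L_d)))

# Second step: L = K_s^inf L_d (Equation 19.35).
L_two_step = resolvent(K_s, L_d)
print(L_reference, L_two_step)
```

Both routes give the same radiance, which is the point of the derivation: splitting the operator changes how the work is organized, not the answer.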

The first hybrid algorithm was presented by Wallace et al. [460]. It used a simple 
form of extended form factors that could only account for planar mirrors, but the 
serial staging of radiosity and ray-tracing solutions was presented. We show an 
example of hybrid rendering using this algorithm in Figure 19.48 (color plate).

The method proposed by Shirley [396] uses three passes, all implemented by ray
tracing. Shirley first distributes energy from the light sources using photon tracing, 
resolves diffuse-diffuse interactions with a version of radiosity that uses ray tracing 
to shoot power from one patch to another, and then renders the image using visibility 
ray tracing. A result of this method is shown in Figure 19.49 (color plate); note the 
bright focused light (a caustic) on the tabletop created by the wine glass. 

Another three-pass method was developed by Heckbert [207]. The first pass is 
similar to a traditional visibility ray-tracing algorithm: rays are fired from the eye 
into the environment. The purpose of this size pass is to determine how densely 
each object in the scene will be sampled when projected to the image plane. This 
information is collected because in the second pass, called the light pass, light is
fired from the light sources into the environment, in a distribution pattern initially 
determined by the results of the size pass: the idea is to make sure that the number 
of photons visible through each pixel is about the same. 

To see the reason for this, suppose that the scene being viewed is just a big flat 
polygon nearly perpendicular to the screen, viewed in perspective. If we didn’t use 
a size pass, then the samples from the luminaires would fall haphazardly on the 
polygon; when we integrated over small regions of the polygon to evaluate a pixel’s 
radiance, we would find some pieces of the polygon with no photons, and others 
with one or more. The result would be a splotchy appearance. So the size pass is 
used to subdivide the surface into surface strata which we know we will sample; 
those induce directional strata on the luminaires, and rays are fired outward through 
each of these directional strata. 

Finally an eye pass uses ray tracing to render the scene. Figure 19.47 shows a 
result of this algorithm; note that the ground plane has not been smoothed. 

Chen et al. developed a multipass method that can be interrupted to show
partial results of different types [86]. They considered the broadest light transport
path L(S|D)*E, and included extra D and S terms before the eye, creating
L(S|D)*DS*DS*E. They suggested a very nice visual metaphor for this path,
shown in Figure 19.50. The dark polygons represent a diffuse surface, and the white 
polygons represent one or more specular surfaces. 

They considered three types of paths: those containing no diffuse elements, those 
with one diffuse element, and those with two or more diffuse elements. The first 
and third cases each have their own algorithm; the case of one diffuse element 
is distinguished into two classes, depending on whether or not there are specular 
surfaces between the light and the diffuse element. Together, these classes account 
for all transport paths. 

The case of no diffuse elements corresponds to the path LS*E. This is shown in
Figure 19.51(a). As indicated by the arrows, visibility tracing (that is, rays generated 
at the eye) is used to evaluate light taking these paths. Note that there might be no 
specular surfaces involved in this path at all; this would be a path LE indicating that 
we’re looking right into a luminaire. A path of the form LSE indicates that we’re
looking at the reflection of a light source in a mirror; a path LSSE is a light source
seen in a chain of two mirrors.

[Figure 19.50: The general path considered by the multipass algorithm.]

[Figure 19.51: (a) LS*E. (b) LDS*E. (c) LS+DS*E. (d) L(S|D)*DS*DS*E.]

The case of a single diffuse element with no specular surfaces between itself and 
the light corresponds to the path LDS*E; this is shown in Figure 19.51(b). Note
that there may be no specular surfaces between the diffuse patch and the eye, or many.
Again, we use visibility ray tracing to find these paths. 

If there is a specular surface between the light and the diffuse patch, then there 
may be many paths. Recalling that S+ means one or more specular surfaces, this 
path would be written LS+DS*E, and is shown in Figure 19.51(c). The arrows
in the path show that we use visibility ray tracing to reach the first diffuse patch. 
Photon tracing is used to generate photons at the light source, and then bounce them 
off of one or more specular surfaces until they arrive at a diffuse patch, where their 
power is recorded in an illumination map. 

Finally, there’s the general case of the path L(S|D)*DS*DS*E, as shown in
Figure 19.51(d). Visibility tracing is used to find light paths that start at a diffuse 
surface, bounce off of one or more specular surfaces, strike another diffuse surface, 
and then bounce off of one or more additional specular surfaces before reaching the 
eye. Progressive radiosity is used to distribute light from the light source into the 
environment via multiple specular and diffuse bounces. As mentioned earlier, each
type of path is accounted for once and only once. 
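The claim that each path is counted once and only once can be tested by brute force. The sketch below writes the four path classes as Python regular expressions and checks that every path up to a chosen length falls into exactly one of them; the dictionary `CLASSES`, the function `classify`, and the limit of five interactions are illustrative choices, not part of the published method.

```python
import re
from itertools import product

# The four path classes, with the path notation's L(S|D)* written as L[SD]*.
CLASSES = {
    "(a) LS*E": re.compile(r"LS*E"),
    "(b) LDS*E": re.compile(r"LDS*E"),
    "(c) LS+DS*E": re.compile(r"LS+DS*E"),
    "(d) L(S|D)*DS*DS*E": re.compile(r"L[SD]*DS*DS*E"),
}

def classify(path):
    """Return the names of every class whose expression generates the path."""
    return [name for name, pattern in CLASSES.items() if pattern.fullmatch(path)]

# Every path with up to five surface interactions should land in one class.
all_paths = ["L" + "".join(middle) + "E"
             for n in range(6)
             for middle in product("SD", repeat=n)]
ambiguous = [p for p in all_paths if len(classify(p)) != 1]

print(len(all_paths), "paths checked;", len(ambiguous), "not in exactly one class")
print(classify("LSDSSE"), classify("LDSDE"))
```

The partition works because the classes are distinguished by the number of diffuse interactions (none, one, or at least two), with the single-diffuse case split on whether a specular surface precedes it.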

A result of this approach is shown in Figure 19.52 (color plate). The different 
light paths are displayed in different images. Notice the high-frequency information 
in the ray-traced caustics that are not in the radiosity caustics, and the richer variation
in diffuse interreflection computed by the radiosity method over the ray-traced
interreflections. 

The hybrid approach has many other variations. The details involve making 
different approximations in the two methods, and the mechanics behind coupling 
them. Some pointers are provided in the Further Reading section. 


19.7 Ray-Tracing Volumes

We can use ray tracing to evaluate volume data by using the full form of TIGRE, 
rather than the vacuum-limited form of VTIGRE. The practical means for efficiently 
evaluating the integration of scattering and volumetric emission along the ray are 
closely tied to the nature of the volumetric medium being rendered, and the particulars
of its organization in the program.

Some discussions of volume tracing may be found in papers by Bhate and Tokuta 
[43], Blasi et al. [45], Inakage [225], Kajiya and Von Herzen [236], Levoy [268,269], 
Nishita et al. [320], Sakas and Gerth [371], and van der Voort et al. [448]. 

An example of a ray-traced volume function from Levoy [268] is shown in Figure
19.53 (color plate). An example including atmospheric media from Inakage
[225] is shown in Figure 19.54 (color plate).


19.8 Further Reading 

The ray-tracing literature is vast. In particular, there has been extensive research 
into ray-object intersection algorithms and efficiency techniques for locating the first 
such intersection. There have also been a number of hardware implementations that 
exploit the natural parallelism in ray tracing (every ray is essentially independent of 
every other, so they may all be traced simultaneously). Much of this literature is
summarized in the book by Glassner et al. [156]. 

Extensive information on geometrical optics involving lenses may be found in 
any optics text, such as Born and Wolf [55], Brown [66], Jenkins and White [230], 
and Moller [311]. 

Hybrid algorithms combining ray tracing and radiosity in various ways may be 
found in the papers by Bouatouch and Tellier [56], Bouville et al. [57], Chen et al. 
[86], Chen and Wu [82], Heckbert [207], Kok et al. [250], Le Saec and Schlick [259], 
Shirley [395,396], Sillion et al. [408,410], Wallace et al. [460], and Zhu, Peng, and 
Liang [505]. A variety of methods for storing illumination maps have been discussed 
by Vedel [453]. 

Efficient Monte Carlo sampling of the BDF for reflection and transmission is 
discussed by Bouville et al. [58]. The problem of sampling large numbers of light 
sources is discussed by Wang and Shirley [464]. 

Many different data structures and algorithms have been explored for accelerating 
the process of finding the first intersection of a ray with the environment. The seminal 
survey that organizes this field is by Arvo and Kirk [17]. The acceleration structures
may be combined in various ways; some discussions for such combinations may be
found in Kirk and Arvo [245] and Glassner [154].

Implementation of a ray tracer and ray-tracing architectures are discussed by 
Heckbert [209], Shirley [399], and Shirley and Wang [401]. 


19.9 Exercises

Exercise 19.1

(a) Write equations for picking lens positions assuming a circular shutter on a
circular lens that opens at uniform speed over a duration $t_0$, stays open for
a time $t_1$, and then closes at uniform speed again over an interval $t_2$, as in
Figure 19.55(a).

(b) Repeat the exercise assuming a linear “guillotine” shutter that moves vertically
up, revealing the circular lens from bottom to top over time $t_0$, staying open
for time $t_1$, and then closing over interval $t_2$, as in Figure 19.55(b).

[Figure 19.55: (a) A circular shutter. (b) A guillotine shutter.]

Exercise 19.2

Use a refraction argument to show that the lens points Q, C, and Q' in Figure 19.56
are colinear accounting for refraction at the surface of the thin lens, even though
QGC and CHQ' are not colinear.

Exercise 19.3

Different acceleration methods are best used for different types of databases. 

(a) For what sort of scenes are bounding volumes most appropriate? 

(b) For what sort of scenes is uniform space subdivision most appropriate? 

(c) For what sort of scenes is adaptive space subdivision most appropriate? 

(d) Can you suggest a means for automatically selecting and applying the right 
subdivision strategy for a given model? Would you recommend mixing methods
within a scene? How would you choose?

Exercise 19.4

Read the works by Pattanaik [332] and Pattanaik and Mudur [333,334], and implement
an importance- (or potential-) based system for distributing energy from light
sources. Is this an expensive algorithm? Can you make it more efficient?

[Figure 19.56: Lens for Exercise 19.2.]


Exercise 19.5

Implement a hybrid radiosity/ray-tracing system using any radiosity and ray-tracing 
methods. Prove that your method doesn’t duplicate any light paths, even if it doesn’t 
capture all types. Demonstrate the range of optical effects you can model. 





If we assume ... that natural signs can simply
be copied from nature, the history of art
represents a complete puzzle. It has become
increasingly clear since the late nineteenth
century that primitive art and child art use a
language of symbols rather than “natural signs.”
... All art originates in the human mind, in our
reactions to the world rather than in the visible
world itself, and it is precisely because all art is
“conceptual” that all representations are
recognizable by their style.

E. H. Gombrich

(“Art and Illusion: A Study in the Psychology of Pictorial 
Representation,” 1960) 



RENDERING AND IMAGES 


20.1 Introduction 

This chapter is about closing the loop between the rendering program, the display, 
and the human observer. 

We have directed a lot of energy in this book toward evaluating the distribution 
of radiance in a scene. If the resulting radiance function is intended to be used to 
represent an image, then we need to understand what happens to our computed 
radiance values when we display them on a real device, and they are perceived by a 
real observer. 

In this chapter we will look at two quite different topics, which are related 
through their intimate connection with the displayed image. We will first look at 
postprocessing methods for converting the synthesized color values of an image to a 
set of displayable color values that will provoke the desired response in an observer. 
Then we will look at feedback-rendering methods, which use accurately displayed 
images to support the interaction of a designer with the rendered scene. 

It is important to know how closely our synthetic images match the reality they 
simulate. One way to make the match is to compare the results against experiments: 
this is the approach followed by Ward in his Radiance system [469]. Alternatively,
you can display a synthetic image side-by-side with a real one, and ask observers if
they can tell the difference; this approach was followed by Meyer et al. [Meyer86a].
This latter approach is a much harder road to follow, because it involves human 
observers, with all of their idiosyncrasies, biases, and complex visual systems. Both 
of these approaches have yielded encouraging results, but the match isn’t perfect. 

Until we are able to confidently assert that our synthetic images contain radiance 
values that are equivalent to what would be measured, experiments with human 
observers are premature (Meyer et al. did in fact make these measurements before 
continuing with the perceptual study). We need to have confidence that our simulation
is right, and then we need to understand how to display the results of that
computation so that it presents the image we intend. 


20.2 Postprocessing 

The information in Units II and III is intimately related; we cannot hope to accurately
evaluate the radiance without using appropriate signal processing. But it may
seem that when the radiance has been computed for every discrete location in the 
display device (e.g., every pixel in a frame buffer has a color), then our job as image 
synthesists is complete. This is not the case; in fact, the material in Unit I on the 
human visual system and displays is as important to image synthesis as the signal 
processing and physics. 

Every display device will affect the picture we intend to show, and that transformation
will affect how the picture is perceived. When creating an image for a
human observer, our goal is not simply to compute the most accurate representation 
of a physical scene, but rather to give the human viewer a particular perception of 
the image. If we want the viewer to think that the image on the screen looks like 
a window into a real scene, then we must account for what happens between the 
frame buffer and the brain as best we can. 

The essence of the argument is that there are physical limits on all display devices: 
they cannot come near the dynamic range of luminance in the physical world. Recall 
Figure 1.13, which demonstrated a luminance range of 16,000 candelas per square
meter from lit snow to 0.00003 candelas per square meter from the sky on a moonless,
overcast night: that’s a dynamic range of more than five hundred million to one! There is no
display device that can come close to that range; film has a useful dynamic range of 
about 1000:1 [441], and CRTs are about 100:1 [467]. And we saw in Unit I that 
each display has its own color gamut, which is always a subset of the full range of 
perceptible colors. 
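The size of the mismatch is easy to quantify. The short sketch below takes the two luminances quoted from Figure 1.13 at face value, computes their ratio, and compares it with the film and CRT ranges cited above in orders of magnitude; the variable names are arbitrary.

```python
import math

snow = 16000.0        # cd/m^2, lit snow (value quoted in the text)
night_sky = 0.00003   # cd/m^2, moonless overcast sky (value quoted in the text)

ranges = {
    "scene": snow / night_sky,
    "film": 1000.0,   # useful dynamic range cited for film
    "CRT": 100.0,     # dynamic range cited for CRTs
}

for name, ratio in ranges.items():
    print(f"{name}: {ratio:.3g}:1, about {math.log10(ratio):.1f} orders of magnitude")
```

The scene spans nearly nine orders of magnitude, against three for film and two for a CRT, so most of the range has to be compressed or discarded before display.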

This means that except when we happen to make a picture that just fits the natural 
color and intensity range of the output device it is shown on, we are instead forced 
to display an approximation. So we will have failed in our goal to present an image 
that the observer will interpret as a view of a real scene.

[Figure 20.1: Radiance values perceived directly and on a CRT.]


Or have we? The human visual system is nonlinear in its response. Perhaps if the 
entire picture is dimmer than a real scene, we will adapt to the overall luminance and 
then in that new state of adaptation the picture will appear correct. This is partly 
true, but then we have to assume that the colors are entirely within gamut, there is 
no ambient illumination on the CRT face or glare on the CRT, the phosphors are 
packed tightly enough together for the viewing distance, and so on. 

Even if all the display parameters are perfect, we still have trouble. For example, 
when the intensity of the light entering the eye becomes bright enough, it begins 
to scatter appreciably, causing bloom and other effects such as star patterns. The 
presence of bloom is a cue to our perceptual system that the intensity of the light is 
very high. 

The visual system is complex, and all our understanding still leaves us quite 
ignorant of many important perceptual cues. Still, if we want to provoke the intended 
response in a viewer, we must understand as well as we can what happens to our 
radiance values once we dare to display them. 

A useful way to think about this has been suggested by Tumblin and Rushmeier 
[442]. Figure 20.1 shows a set of radiance values (denoted L) that describe a real 
scene. We’ll assume for the moment that they are at frame-buffer resolution and 
represent the best possible color values for display on an optimal monitor under 
ideal conditions. The figure shows two paths to perception of these luminances, 
depending on whether the viewer sees them directly in the real world or on the front
of a CRT. The top path corresponds to direct perception of the radiance (say by
looking at a real scene), and the lower path represents the perception of the scene on
a monitor.

[Figure 20.2: Compensation for the device.]

The upper path involves the transformation of $L$ by a nonlinear vision operator
$\mathcal{V}_s$. When looking at the real scene, the human visual system will adapt in a variety
of ways to the incident illumination. The scene-adapted state of the vision operator
$\mathcal{V}$ is written $\mathcal{V}_s$. This operator (whatever it may be) processes an image and creates
a perception of an image in the observer’s mind; the result of the operator is the
perceived image $\mathcal{V}_s L$.

On the lower path we see the radiance first passes through a display, where it
is mapped through an operator $\mathcal{D}$. This operator is intended to capture everything
involved in the display of the image, including ambient lighting, gamma correction,
color drift over time, and so on. The observer then looks at the display, but because
the illumination is different, the adaptation of the visual system is different. We
model the display-adapted visual system with the operator $\mathcal{V}_d$, so the perceived
image is $\mathcal{V}_d \mathcal{D} L$.

Unless $\mathcal{V}_d \mathcal{D} = \mathcal{V}_s$, the image seen on the screen will not produce the same
perceptual response as looking at the real scene. We call actions taken to address
this problem display compensation, and in general it is a postprocessing method.

The postprocessing approach is illustrated in Figure 20.2. The idea is to insert
new operators along the display path so that the final operator result is $\mathcal{V}_s$, not $\mathcal{V}_d \mathcal{D}$.
We do this by inserting three new operators.

Assume for the moment that we understand the visual system well enough that
we know just what $\mathcal{V}_s$ and $\mathcal{V}_d$ do, and furthermore that both of these operations
are invertible. Similarly, assume we know what $\mathcal{D}$ does, and that it too is invertible.
All of these assumptions are dubious at best. But if we had these transformations,
we could combine them to create the desired perception.

As shown in Figure 20.2, we begin by applying the scene-adapted visual system,
creating $\mathcal{V}_s L$; this creates the image we want to get into the observer’s head. But
this picture will be seen by the display-adapted observer, so we precorrect for the
transformation that will occur in that perception by multiplying with the inverse
of that adaptation, creating $\mathcal{V}_d^{-1} \mathcal{V}_s L$. This is the correct image, but displaying it
through the monitor will distort it. So we correct for that distortion by preapplying
the inverse of the monitor transformation, creating $\mathcal{D}^{-1} \mathcal{V}_d^{-1} \mathcal{V}_s L$. Now everything
unrolls: we display the picture using the operator $\mathcal{D}$, and it is perceived by the visual
system using the display-adapted operator $\mathcal{V}_d$, creating

$$\mathcal{V}_d \mathcal{D} \mathcal{D}^{-1} \mathcal{V}_d^{-1} \mathcal{V}_s L = \mathcal{V}_s L \tag{20.1}$$

There is a subtlety here. You may have noticed that the $\mathcal{V}_d$ used in Figure 20.2
is not the same $\mathcal{V}_d$ used in Figure 20.1. Since the displayed pictures are different
(here it’s $\mathcal{D}^{-1} \mathcal{V}_d^{-1} \mathcal{V}_s L$ rather than just $L$), the visual system will be presented with
a different set of stimuli and will thus adapt differently. So $\mathcal{V}_d^{-1}$ is not a constant,
but varies with the image being displayed.
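The cancellation in Equation 20.1 can be demonstrated with toy operators. In the sketch below every operator is a simple power law acting on a single normalized value; the exponents are invented purely so that each operator is invertible, and they are not a model of real vision or of any particular monitor. The sketch also treats the display-adapted operator as fixed, ignoring the subtlety just described.

```python
def power_law(exponent):
    """Return an invertible toy operator x -> x**exponent and its inverse."""
    def forward(x):
        return x ** exponent
    def inverse(y):
        return y ** (1.0 / exponent)
    return forward, inverse

# Hypothetical stand-ins; none of these exponents come from the text.
V_s, V_s_inv = power_law(0.33)   # scene-adapted vision operator
V_d, V_d_inv = power_law(0.45)   # display-adapted vision operator
D, D_inv = power_law(2.2)        # display operator

def compensate(L):
    """Precorrect a radiance value: apply V_s, then V_d^-1, then D^-1."""
    return D_inv(V_d_inv(V_s(L)))

def perceive_on_display(value):
    """What the display-adapted observer perceives: V_d applied after D."""
    return V_d(D(value))

for L in [0.01, 0.2, 0.75]:
    target = V_s(L)
    naive = perceive_on_display(L)
    corrected = perceive_on_display(compensate(L))
    print(f"L={L}: target={target:.4f} naive={naive:.4f} corrected={corrected:.4f}")
```

Without compensation the perceived value differs from the target; with it, the display and adaptation operators cancel and the target is reproduced up to rounding.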

A postprocessing approach to display compensation such as this works with a 
collection of display values: these may be radiances, floating-point color specifications,
or even simply integer RGB values for a frame buffer. These methods are
appropriate for application after rendering is complete but before display. Thus a
single rendered image may be stored with arbitrary color information in a file, and
then a different postprocessing algorithm may be applied to it for each device on
which it is to be shown.


20.2.1 A Nonlinear Observer Model

A general approach to the post-processing problem was taken by Tumblin and 
Rushmeier [442]. They used a model like that in Figure 20.2, where two different 
forms of the visual system and an inverse representation of the display device were 
modeled. 

They addressed their work to the correct adjustment of CRT intensity values 
to compensate for adaptation in different environments. Imagine viewing a simple 
indoor scene lit by a single firefly. After you had time to adapt to the low illumination, 
a few of the most reflective objects in the room might be barely visible, but there 
would be little contrast, and most of the room would be shrouded in darkness. Now 
replace the firefly with an aircraft searchlight. Suddenly the illumination in the room
is increased by a factor of $10^{11}$. If you could stand to open your eyes, you would find
everything washed out in a flood of light. Only the deepest and darkest shadows 
would interrupt the otherwise bright white visual field. 

If we were to render these two images, we would have radiance values that were 
all well below or well above what could be displayed on a monitor. The dark image 
would be completely represented by pixels of value 0, and the bright image would 
be well beyond the luminance put out by the CRT at maximum power. So we would 
be forced to somehow map both images to the restricted range of the monitor. A 
reasonable approach would be to scale each image so the darkest pixel went to 0 
and the brightest to 1. The result would be that both images would appear exactly 
the same! 
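
A few lines of code make the loss of information concrete. This is a minimal sketch
with made-up numbers: the four reflectances and the two source strengths are
hypothetical, chosen only so that the two sources differ by the factor of $10^{11}$
mentioned above.

```python
def normalize(radiances):
    """Linearly map the darkest value to 0 and the brightest to 1."""
    lo, hi = min(radiances), max(radiances)
    return [(r - lo) / (hi - lo) for r in radiances]


# Hypothetical relative reflectances for four pixels of the same room.
reflectances = [0.05, 0.2, 0.5, 0.9]
firefly = [r * 1e-6 for r in reflectances]       # very dim source
searchlight = [r * 1e5 for r in reflectances]    # 10^11 times brighter

dim = normalize(firefly)
bright = normalize(searchlight)
```

The lists `dim` and `bright` are the same up to rounding, so the absolute level of
illumination has been discarded by the mapping.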

This isn’t a desirable result: after all the effort involved in computing an accurate 
scene simulation, we would hardly like to destroy important information at the very 
end. If a single room appears the same when lit by a firefly or a searchlight, then we 
have lost some information. 

To compensate for the change in brightness ranges between different devices, 
printers apply a tone reproduction operator T to the brightness values of the pixels. 
Typically, this operator implements a tone reproduction curve (or TRC), though in 
general the function may be more complex. Tumblin and Rushmeier have developed 
such an operator that attempts to capture the effect of adaptation. 

The tone reproduction operator T maps reals to reals, so it doesn’t deal with 
color directly. This is reasonable for a simple algorithm; modeling the human visual 
system is hard enough when dealing only with brightness. The Tumblin-Rushmeier 
model creates a composite operator $T = D^{-1} V_d^{-1} V_s$.

The starting point for their work was research published by Stevens and Stevens 
in 1953 and 1963. They gave models for the perceived brightness of a target on the 
basis of the adaptation of the observer and the luminance of the target. The clever 
idea behind these experiments was to adapt each of a subject’s eyes to a different 
background luminance. The left eye saw only a black field. The right eye saw a 
panel of some constant luminance $L_a$ and was given enough time to adapt to that
luminance. 

Then a small spot (the target) was displayed in each field, and the subject was
asked how bright the target appeared. Brightness is measured on a linear scale of units
called brils. A single bril is the sensation of brightness from a fully dark-adapted eye
viewing a 5° target of 1 microlambert for one second. Brils are linear, so $2n$ brils
are twice as bright as $n$ brils. Notice that brils quantify subjective brightness, not
objective luminance or radiance.

Assuming that the eye was shown (and adapted to) a 100% diffuse reflecting
white surface with radiance $L_a$, then a target of radiance $L$ has a brightness of $P$
brils given by

$$\log(P) = \alpha \log(L) + \beta \tag{20.2}$$

where all logarithms are base 10, and

$P$ is the brightness in brils,
$L$ is the target radiance in lamberts, and
$L_a$ is the luminance of the surrounding field.

The experimental constants are given by

$$\alpha = 0.4 \log(L_a) + 2.92$$
$$\beta = -0.4 (\log(L_a))^2 - 2.584 \log(L_a) + 2.0208 \tag{20.3}$$

Often the brightness is denoted B, but that letter would clash with the commonly 
used letter for radiosity and the two measures would be difficult to distinguish just 
from context. 
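
Equations 20.2 and 20.3 translate directly into code. The following is a minimal
sketch; the function names are our own, and radiances are assumed to be positive and
expressed in lamberts as in the text.

```python
import math


def stevens_constants(L_a):
    """Return (alpha, beta) of Equation 20.3 for adaptation level L_a in lamberts."""
    log_la = math.log10(L_a)
    alpha = 0.4 * log_la + 2.92
    beta = -0.4 * log_la ** 2 - 2.584 * log_la + 2.0208
    return alpha, beta


def brightness_brils(L, L_a):
    """Equation 20.2: perceived brightness P (brils) of a target of radiance L."""
    alpha, beta = stevens_constants(L_a)
    return 10.0 ** (alpha * math.log10(L) + beta)
```

For a fixed adaptation level with positive `alpha`, `brightness_brils` grows as a
power of the target radiance, which is the behavior the composite operator relies on.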

Unfortunately, Equation 20.2 doesn’t generalize very well for complex scenes. 
More complex empirical models for brightness perception with respect to adaptation 
have been developed, but have not yet been explored for graphics [442]. 

The next step in building the operator $T$ is to model the display device operator
$D$. Tumblin and Rushmeier used the following model:

$$L_d = v^{\gamma} L_{d,\max} + L_b \tag{20.4}$$

where

$L_d$ is the screen display value of a pixel in lamberts.
$L_{d,\max}$ is the maximum screen luminance, typically 0.027 lamberts.
$v$ is the intensity stored in the frame buffer.
$\gamma$ is 2.8 to 3.0 for uncorrected CRTs, or about 1.2 if the
display includes gamma correction.
$L_b$ is the background radiance in lamberts.

The background radiance is the sum of two terms: the product of the ambient (or
surround) radiance $L_s$ and the screen reflectance $s$, and the radiance $L_c$ that
results from internal reflections within the CRT itself. So $L_b = s L_s + L_c$.

We can solve for the display value $v$ for a desired radiance $L_d$:

$$v = \left[ \frac{L_d}{L_{d,\max}} - \frac{L_b}{L_{d,\max}} \right]^{1/\gamma} \tag{20.5}$$


Tumblin and Rushmeier note that the fraction $L_b/L_{d,\max}$ describes the ratio of the
darkest to the brightest radiance achievable from the screen; this is the inverse of one
definition for the contrast $C$ (using the definition of contrast as brightest to darkest
luminances). Typical CRTs have a contrast of about 35 in normal conditions [466].
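
The display model and its inverse can be sketched as a pair of functions. We assume
the forward model has the form $L_d = v^{\gamma} L_{d,\max} + L_b$, which is the form
that Equation 20.5 inverts. The default values are the typical figures quoted in the
text (0.027 lamberts, a contrast of 35, and an uncorrected gamma of 2.8); the function
names and the clamping step are our own additions.

```python
def display_radiance(v, L_d_max=0.027, gamma=2.8, L_b=0.027 / 35.0):
    """Assumed forward model: screen radiance (lamberts) for a value v in [0, 1]."""
    return L_d_max * v ** gamma + L_b


def frame_buffer_value(L_d, L_d_max=0.027, gamma=2.8, L_b=0.027 / 35.0):
    """Equation 20.5: frame-buffer value that produces screen radiance L_d."""
    x = (L_d - L_b) / L_d_max
    x = min(max(x, 0.0), 1.0)   # radiances outside the screen's range are clamped
    return x ** (1.0 / gamma)
```

Any requested radiance at or below the background level $L_b$ maps to a pixel value
of 0, which is one way to see why $L_b/L_{d,\max}$ limits the achievable contrast.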

To find $v$ for a given $L_w$, we now need only find $L_d$. We find that value by
asserting a brightness match between two observers, one adapted to the real-world
scene (using the subscript $w$) and the other to the display (using the subscript $d$).
Then from Equation 20.2 we have the same perceived brightness $P$ for the display
pixel radiance $L_d$ and the computed pixel radiance $L_w$, so

$$\alpha_w \log(L_w) + \beta_w = \alpha_d \log(L_d) + \beta_d \tag{20.6}$$

Solving for $\log(L_d)$, we find

$$\log(L_d) = \frac{\alpha_w \log(L_w) + \beta_w - \beta_d}{\alpha_d} \tag{20.7}$$

$$L_d = L_w^{\alpha_w/\alpha_d} \cdot 10^{(\beta_w - \beta_d)/\alpha_d} \tag{20.8}$$

where we have noted that $10^{a \log(b)} = b^a$.

The only remaining step now is to find values for $(\alpha_w, \beta_w)$ and $(\alpha_d, \beta_d)$. These
can be computed from Equation 20.3 if we can determine an appropriate value of $L_a$
to use. Tumblin and Rushmeier give a practical solution to determining this constant 
for both the real-world scene and the display image. 

They note from the Stevens and Stevens experiments that the human visual system 
tends to adapt not to the average luminance in the scene, but to a level such that
most of the brightness is a fixed amount below the adaptation level $L_a$. They reason
that the logarithm of the adaptation level will be the expected value of the logarithm
of the scene luminances, plus 0.84 to account for experimental data. That is,

$$\log(L_{a,w}) = E[\log(L_w)] + 0.84 \tag{20.9}$$


This only holds if we imagine that the eye adapts to the entire scene at once. This 
generally isn’t true. For example, imagine an image of a room interior where you 
can see both the floor and the lights. If you look at the lights, your eyes will 
adapt to the bright illumination; if you look at the floor, you’ll adapt to the darker 
illumination. Nevertheless, this single-adaptation idea is a good starting point. The 
expected value for the image can be computed simply as the log average of all the 
computed luminances in the scene. This real-world value $L_{a,w}$ can then be used as
$L_a$ to compute $\alpha_w$ and $\beta_w$.
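
The log average of Equation 20.9 takes only a few lines. In this sketch we assume
that pixels with zero luminance are simply skipped, since they have no logarithm;
that choice, like the function name, is ours and is not taken from the original
method.

```python
import math


def world_adaptation_level(luminances):
    """Equation 20.9: log10(L_a,w) = mean(log10(L_w)) + 0.84."""
    # Assumption: zero (or negative) pixels are ignored rather than clamped.
    logs = [math.log10(L) for L in luminances if L > 0.0]
    mean_log = sum(logs) / len(logs)
    return 10.0 ** (mean_log + 0.84)
```

Because the average is taken over logarithms, a few very bright pixels pull the
adaptation level up far less than they would pull an arithmetic mean.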

Now finding the display adaptation level $L_{a,d}$ is a bit trickier. We would like
to just average the pixel luminances as in the real-world case, but we don’t know
them yet. So instead we assume that they are evenly distributed, and we estimate the
adaptation level as the ratio of the maximum displayable luminance to the square
root of the maximum available contrast:

$$L_{a,d} = \frac{L_{d,\max}}{\sqrt{C}} \tag{20.10}$$

This display value $L_{a,d}$ can be used as $L_a$ to compute $\alpha_d$ and $\beta_d$.

Now we can put the pieces together. Plugging the value for $L_d$ from Equation 20.8
into the relationship in Equation 20.5, we find the operator $T$, which maps computed
radiances $L_w$ to pixel values $v$:

$$v = \left[ \frac{L_w^{\alpha_w/\alpha_d} \cdot 10^{(\beta_w - \beta_d)/\alpha_d}}{L_{d,\max}} - \frac{1}{C} \right]^{1/\gamma} \tag{20.11}$$


Although Equation 20.11 looks formidable, most of it is made up of constants 
that are fixed for a given image. When the values of $L_a$ are known for the image
and the display, the values of $\alpha$ and $\beta$ may be calculated.
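
To show how the constants are fixed once per image, here is a sketch that builds the
operator of Equation 20.11 as a closure. It is an illustration under stated
assumptions rather than Tumblin and Rushmeier's own code: the display defaults are
the typical values quoted earlier, zero-luminance pixels are skipped when averaging,
and results outside the displayable range are clamped.

```python
import math


def stevens_constants(L_a):
    """(alpha, beta) of Equation 20.3 for adaptation level L_a in lamberts."""
    log_la = math.log10(L_a)
    return (0.4 * log_la + 2.92,
            -0.4 * log_la ** 2 - 2.584 * log_la + 2.0208)


def make_tone_operator(world_luminances, L_d_max=0.027, contrast=35.0, gamma=2.8):
    """Build T of Equation 20.11 for one image; luminances are in lamberts."""
    logs = [math.log10(L) for L in world_luminances if L > 0.0]
    L_a_w = 10.0 ** (sum(logs) / len(logs) + 0.84)   # Equation 20.9
    L_a_d = L_d_max / math.sqrt(contrast)            # Equation 20.10
    a_w, b_w = stevens_constants(L_a_w)
    a_d, b_d = stevens_constants(L_a_d)
    exponent = a_w / a_d
    scale = 10.0 ** ((b_w - b_d) / a_d) / L_d_max

    def T(L_w):
        if L_w <= 0.0:
            return 0.0
        x = scale * L_w ** exponent - 1.0 / contrast
        # Assumption: values the display cannot show are clamped to [0, 1].
        return min(max(x, 0.0), 1.0) ** (1.0 / gamma)

    return T
```

A typical use is `T = make_tone_operator(image_luminances)` followed by applying `T`
to every pixel. Only the power and the subtraction are evaluated per pixel; everything
else is computed once.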

Figure 20.3 (color plate) shows a set of five scenes processed by this model. The 
brightest image contains an overhead lamp with an intensity of 1,000 lamberts; it 
is mostly washed out. Each successive image shows the result of a decrease in the 
lamp luminance by a factor of 100. The final image has a 10-microlambert light. 
No processing has been done to these images except to apply Equation 20.11. 

A pair of color figures generated with this method is shown in Figure 20.4 (color 
plate). Here the three color channels were adjusted independently. Figure 20.4(a) 
shows a cabin viewed by daytime illumination arriving through the window.
Figure 20.4(b) shows the same cabin viewed by artificial nighttime illumination from
the overhead lamp; the overall illumination in the room is much lower. 


20.2.2 Image-Based Processing

A different approach to constructing a tone reproduction operator for postprocessing 
has been reported by Chiu et al. [88]. They observed the problem of displaying a 
typical indoor scene that they rendered. The scene included a bare light bulb. The 
radiance values for pixels directly displaying the bulb were 500, and those on the 
floor were about 0.017, for a dynamic range of about 30,000:1, which is more than
we can get from a CRT. 

They considered a variety of simple tone reproduction curves, similar to the type 
of simple choices we discussed for gamut mapping in Section 3.6. As we noted for 
the gamut-mapping problem, they observed that any TRC that applies uniformly to 
the entire image is unlikely to produce acceptable results. 

Chiu et al. made an interesting observation about the visual system that helped 
make the problem a bit easier to solve. Recall that the visual system is a poor judge 
of absolute values; it’s relative radiances and contrasts that we’re optimized to detect. 
In fact, they note that as long as it’s kept within a factor of about four, we can apply 
a slowly changing scaling factor to the image and it will be undetectable (the precise 
meaning of “slowly changing” depends on the picture and the adaptation level of 
the observer). They write the scaling factor as a 2D function $s(i, k)$ for a grid of
pixels. The computed pixels themselves are $p(i, k)$, so the displayed image at each
pixel $(i, k)$ is given by $s(i, k)\,p(i, k)$.

One way to achieve a slowly changing scaling function $s$ is to blur the image.
Suppose that the blurred image has new pixels $b(i, k)$. Then the scaling function may
be written as

$$s(i, k) = \frac{1}{h\, b(i, k)} \tag{20.12}$$

for some value of $h$. As $h$ increases, the scaling function pulls the brightest pixels
into display range, but the whole picture darkens as well. They found that $h = 8$ was a
good compromise for their test images.
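
The scaling step can be sketched on a small grid. The box blur below is only a
stand-in for the wide, smooth filter that Chiu et al. describe, and the clamping and
resmoothing stages of their method are omitted; the function names and the default
blur radius are assumptions made for illustration. All pixel values are assumed to be
positive.

```python
def box_blur(image, radius):
    """Blur a 2D list of floats, averaging only the pixels that lie inside the image."""
    rows, cols = len(image), len(image[0])
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for k in range(cols):
            total, count = 0.0, 0
            for di in range(-radius, radius + 1):
                for dk in range(-radius, radius + 1):
                    ii, kk = i + di, k + dk
                    if 0 <= ii < rows and 0 <= kk < cols:
                        total += image[ii][kk]
                        count += 1
            out[i][k] = total / count
    return out


def scaled_image(p, h=8.0, radius=2):
    """Return s(i,k) * p(i,k) with s = 1 / (h * b), where b is the blurred image."""
    b = box_blur(p, radius)
    return [[p[i][k] / (h * b[i][k]) for k in range(len(p[0]))]
            for i in range(len(p))]
```

On a perfectly uniform image the blur changes nothing, so every output pixel is
$1/h$; in a real image, a pixel ends up brighter or darker than $1/h$ according to
how it compares with its blurred neighborhood.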

To compute the blurred picture, they experimented with several different filters. 
They discovered that the precise shape of the filter didn’t matter much as long as it 
was smooth and very wide. In fact, the filter had to be about as wide as the picture 
in order to avoid artifacts. They used the filter $a e^{-0.01 r}$, where $r$ is the distance from
the filter center and $a$ is the normalization constant for the filter. Applying that filter
to the image many times is prohibitively expensive. Such filtering is much easier in 
Fourier space, where the repeated convolutions become a single multiplication with 
an exponentiated version of the transformed filter (again a Gaussian). Chiu et al. 
chose instead to filter a subsampled version of the image and then interpolate to fill 
in the missing pixels. 

The approach still leaves some pixels above 1. Their solution was to clamp these 
pixels. Then the scaling function itself is smoothed several times using a much smaller 
filter to round the sharp edges introduced by the clamping process. Figure 20.5 (color 
plate) shows the blurred original image, the original scaling function, and the scaling 
function after clamping and smoothing. Pixels that were clamped were not allowed 
to change as a result of the smoothing. 

The result of this operation is shown in Figure 20.6 (color plate), where the 
image is the one that was blurred for Figure 20.5(a), and the scaling function is the 
smoothed, clamped version from Figure 20.5(d). Note that in Figure 20.6(a) there is 
a lot of light coming off the wall near the bulb that is simply clipped, causing a flat, 
white disk on the wall. This is turned into a nicely shaded glow in Figure 20.6(c), 
yet the floor is still visible. 

One clue that the luminance values in Figure 20.6(c) are not those we would see 
looking at the real scene is that there is no bloom (or glare). As mentioned earlier, 
when light is intense enough it scatters inside the eye, causing a halo around the 
brightest objects in the scene. We can include that halo in the image itself (this is 
another example of applying the scene-adapted vision operator $V_s$ to the image).

Chiu et al. modeled blooming with a local function that tends to spread out very
bright illumination locally. The image with bloom is found by convolving the image
with a blooming filter $b(i, k)$, which is a small nonlinear filter that spreads out very
bright regions. The blooming filter is given by

$$b(i, k) = \begin{cases} p & \text{if } i = k = 0 \\ f(i, k) & \text{if } \sqrt{i^2 + k^2} < w/2 \\ 0 & \text{otherwise} \end{cases} \tag{20.13}$$

where

$w$ is the width of the filter.
$p$ is the amount of blooming.
$f(i, k) = \frac{1 - p}{F} \left| \sqrt{i^2 + k^2} - (w/2) \right|^n$
$F$ is the normalization constant for the filter.
$n$ is the bloom spreading factor ($> 1$).

Figure 20.7 (color plate) shows the same image as in Figure 20.6(a), but after 
application of the blooming filter. 


20.2.3 Linear Processing 

As we saw above, the algorithm by Tumblin and Rushmeier [442] estimates the 
adaptation of the eye to the real-world scene and then shifts the luminances in the 
image to match the brightness perceived by an observer. For a given image, the 
transformation involves a scaled exponentiation of the image luminances. 

Ward sought a linear transformation that would produce similar results at lower 
cost [468], transforming the real-world radiance $L_w$ to a display radiance $L_d$ using
a scaling factor $m$:

$$L_d = m L_w \tag{20.14}$$

He noted that the effect of adaptation can be viewed as a shift in the absolute 
difference in luminance required in order to see a change. In other words, the 
difference $L_2 - L_1$ might not be visible when adapted to one luminance level, but it
might be easily visible when adapted to some other luminance. 

Building on the work of Blackwell, Ward began with a relationship that says if 
the eye is adapted to luminance level $L_a$, the smallest change in luminance $\Delta L$ that
can be seen satisfies

$$\Delta L(L_a) = 0.0594\,(1.219 + L_a^{0.4})^{2.5} \tag{20.15}$$

where luminances are measured in candelas per square meter (recall from Table 13.2
that $1\ \mathrm{cd/m^2} = \pi \times 10^{-4}$ lamberts).
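
The threshold relationship of Equation 20.15 is easy to evaluate, together with the
unit conversion just mentioned. This is a minimal sketch; the names are ours.

```python
import math

# 1 cd/m^2 expressed in lamberts, as recalled in the text.
LAMBERTS_PER_CD_M2 = math.pi * 1e-4


def threshold_cd_m2(L_a):
    """Equation 20.15: smallest visible luminance change at adaptation L_a (cd/m^2)."""
    return 0.0594 * (1.219 + L_a ** 0.4) ** 2.5


def lamberts_to_cd_m2(lamberts):
    """Convert a luminance in lamberts to candelas per square meter."""
    return lamberts / LAMBERTS_PER_CD_M2
```

The threshold grows with the adaptation level, so a luminance difference that is
visible in a dim room can fall below threshold in a bright one.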

Now because the world luminances are mapped to the display luminances by 
Equation 20.14, we can map the smallest discernible changes in luminance as well: 

$$\Delta L(L_{a,d}) = m\, \Delta L(L_{a,w}) \tag{20.16}$$

where, as before, $L_{a,d}$ is the adaptation level of the eye to the display, and $L_{a,w}$ is
the adaptation level of the eye to the real-world scene. So the scaling factor $m$ tells
us how to map luminances from the world to the display such that a just-noticeable
change in world luminance maps to a just-noticeable change in display luminance.
Solving Equation 20.16 for $m$, we find

$$m = \frac{\Delta L(L_{a,d})}{\Delta L(L_{a,w})} = \left[ \frac{1.219 + L_{a,d}^{0.4}}{1.219 + L_{a,w}^{0.4}} \right]^{2.5} \tag{20.17}$$


Now, as with the Tumblin and Rushmeier method, we need to estimate $L_{a,d}$ and
$L_{a,w}$. Ward assumed that the display adaptation is at about half of the average radiance
of the image, and that the average image is about equally distributed around the
mean intensity $(L_{d,\max} + L_b)/2$, where, as before, $L_{d,\max}$ is the maximum displayable
luminance and $L_b$ is the background luminance. Ward further assumed that $L_b$ is
negligible, so $L_{a,d} = L_{d,\max}/2$. He used the same approximation as Tumblin and
Rushmeier, taking the log average of the scene luminances to estimate $L_{a,w}$.

Plugging these values into Equation 20.17 gives us luminance values from 0 to
$L_{d,\max}$, so we divide by the maximum to get values in the range [0,1]. The scaling
factor is then given by

$$m = \frac{1}{L_{d,\max}} \left[ \frac{1.219 + (L_{d,\max}/2)^{0.4}}{1.219 + L_{a,w}^{0.4}} \right]^{2.5} \tag{20.18}$$


Ward suggests that a good value for $L_{d,\max}$ is about 100 candelas per square
meter for most CRTs; a photograph under indoor lighting can be as high as 120 
candelas per square meter. 
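
Because the whole method reduces to one multiplication per pixel, it is short to
write down. In this sketch the world adaptation level `L_a_w` is passed in by the
caller, so the code takes no position on how it was estimated; the final clamp to 1
and the function names are our own assumptions. All luminances are in candelas per
square meter.

```python
def ward_scale_factor(L_a_w, L_d_max=100.0):
    """Equation 20.18: factor m that maps world luminance to display values in [0, 1]."""
    ratio = (1.219 + (L_d_max / 2.0) ** 0.4) / (1.219 + L_a_w ** 0.4)
    return ratio ** 2.5 / L_d_max


def to_display(world_luminances, L_a_w, L_d_max=100.0):
    """Scale every luminance by m; values the display cannot reach are clamped to 1."""
    m = ward_scale_factor(L_a_w, L_d_max)
    return [min(m * L, 1.0) for L in world_luminances]
```

When the world adaptation level equals the display adaptation level $L_{d,\max}/2$,
the bracketed ratio is 1 and the factor reduces to $1/L_{d,\max}$, a plain
normalization. A brighter scene gives a smaller $m$ and a dimmer scene a larger one.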

Figure 20.8 (color plate) shows an indoor cabin scene as viewed by sunlight in 
the day and by indoor illumination at night, using Equation 20.18. Notice that the 
nighttime scene is darker. 

Figure 20.9 (color plate) was generated assuming that the viewer had fixated on 
the cabin window and had adapted to the luminance there. In the daytime view in 
Figure 20.9(a), we can see the outer world through the window with greater clarity 
because it has been given more dynamic range in the image. The rest of the scene has 
reduced dynamic range. In the night view in Figure 20.9(b), very little light makes it 
through the window; at low illumination levels we can’t see such dark objects. 


20.3 Feedback Rendering 

The techniques described in the previous section allow us to display an image that
will be perceived in a way similar to the perception of a real scene being directly
viewed. Given that ability, we can now interact with the resulting image on a higher
level, expressing aesthetic judgements about the image that in turn affect the
underlying scene and create a new picture. We call this feedback rendering.

[Figure 20.10: Feedback rendering.]

Feedback rendering is illustrated schematically in Figure 20.10. In this technique 
the image is rendered, then adjusted and rerendered (perhaps from scratch) until it 
satisfies some visual criteria. At first glance this looks like the typical interactive 
cycle of image generation: render, display, judge, change the scene, and repeat. The 
difference is that at least some fraction of the judging step is taken over by the 
computer, which makes automatic changes to the underlying scene representation in 
order to satisfy criteria associated with the displayed image. 

The feedback-rendering algorithms presented below use some information derived
from an image to change the scene description, and then create a new image.
Sometimes the information is in the form of high-level aesthetic judgements by the 
user, and sometimes it is based on the quality of the displayed image. 

The process usually involves expressing the 3D scene in terms of some tractable 
number of parameters, such as light colors or surface reflectivities. We call these the 
scene parameters. For a given picture, the n scene parameters represent a surface in 
an n-dimensional picture space. All of the possible images that can be generated by
that scene may be considered points in that n-dimensional space. Any operations in
that space may be thought of as moving around on that surface. 

The task now becomes one of finding that point in picture space that has scene 
parameters that match the desired goals. This can be expressed as an optimization 
problem: given some criteria for the final image, find the set of scene parameters 
that produce an image that best satisfies those criteria. Optimization is not an easy 
job, and typically the sorts of problems we need to solve have a large number of 
dimensions and are very nonlinear. Finding the right optimization method for each 
task is important. 

To support feedback rendering, the renderer needs to save more information
about a picture than simply the color values at each pixel. At a minimum, it needs 
to be able to determine which object is located at a given pixel. This is often 
accomplished with an object tag, which is an integer associated with each pixel that
identifies which shape has been drawn into that pixel. The surface parameters for 
the object at that point are also often stored; means for doing this are discussed 
by Hanrahan and Haeberli [189]. This approach only works for pixels that have 
a single sample (and thus are usually full of aliasing artifacts), because if there are 
multiple samples in a pixel, it can become very difficult to disambiguate a designer’s 
intentions when that pixel is changed. This is acceptable, since interactive design 
is often carried out at lower resolution and image quality than the final image. 
When the designer is happy with the scene parameters, then a full (and anti-aliased) 
rendering may begin. 


20.3.1 Illumination Painting

To create an image, we generally design a 3D scene description and then render it. 
The scene description is often created with a modeling program, which is designed
to allow the interactive manipulation of the various 3D shapes in the scene. We can 
usually place cameras and lights as well, but the image is rendered quickly in order 
to provide rapid feedback; sometimes the modeling program only draws wireframe 
views of the scene. 

Thus when the scene is rendered with a bit more accuracy, the geometry is typically 
close to what was expected, but the shading can be way off. Often the designer needs 
to repeatedly adjust the colors, positions, and directions of the lights to achieve a 
desired result. 

Schoeneman et al. [382] noted that the desired illumination can often be expressed 
by the designer in the form of a sketch, by simply painting in the desired final colors 
in the scene. Rather than force the designer to manually adjust the lights to find a 
match to this sketch, the computer could try to automatically find a setting of the 
lights that does the job. 

The scene parameters adopted by Schoeneman et al. are the light intensities and
colors. The process begins with an initial rendering of the scene from the designer’s
starting guess. The result is an image that is displayed.

The designer then uses a paint program to interactively draw new colors on top 
of the rendered image. In effect the designer is telling the system the color of each 
surface after illumination. From the object tag, the system can determine which 
object is being colored. From the object geometry, the illumination of the object by 
each light can be determined. From the viewing geometry, the radiance of the object 
that is displayed on the screen can be determined. The job then becomes one of 
finding the right intensity and color setting for each light so that each object receives 
the necessary illumination, such that when it is projected to the screen, it has the 
color drawn in by the designer. 

To find these light colors, Schoeneman et al. use a constrained least-squares 
optimization method to find that set of light colors and intensities that best satisfies 
the desired combinations of the lights and the surfaces. Surfaces that have been 
painted receive more weight in this optimization step than others, based on the 
presumption that the designer has explicitly painted everywhere that matters; by 
implication, anything left undrawn can change freely. 
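
One way to picture this optimization is as a small weighted least-squares problem:
each surface sample contributes one equation saying that the summed contributions of
the lights should equal the painted value. The sketch below is not the solver used by
Schoeneman et al.; it is a minimal projected-gradient stand-in for a single color
channel, with nonnegative intensities as the only constraint. The matrix `A` (the
contribution of each unit-intensity light to each sample), the weights, and the
function name are all assumptions made for illustration.

```python
def solve_light_intensities(A, target, weights, steps=5000):
    """Minimize sum_i w_i * (sum_j A[i][j] * x[j] - target[i])**2 subject to x >= 0."""
    m, n = len(A), len(A[0])
    # Safe step size: the trace of 2 A^T W A bounds its largest eigenvalue.
    bound = 2.0 * sum(weights[i] * A[i][j] ** 2 for i in range(m) for j in range(n))
    step = 1.0 / bound
    x = [0.0] * n
    for _ in range(steps):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - target[i]
                    for i in range(m)]
        grad = [2.0 * sum(weights[i] * A[i][j] * residual[i] for i in range(m))
                for j in range(n)]
        # Gradient step, then projection onto the constraint x >= 0.
        x = [max(0.0, x[j] - step * grad[j]) for j in range(n)]
    return x
```

Giving painted samples larger entries in `weights` than unpainted ones reproduces the
weighting described above: the fit is pulled toward the surfaces the designer has
explicitly colored.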


20.3.2 Subjective Constraints

A related but more ambitious system was built by Kawai et al. [242]. They allowed a 
richer set of scene parameters, including not just the light intensity, but the direction 
and focus of spotlights, and the reflectivity of surfaces. Like Schoeneman et al., they 
restricted their system to diffuse surfaces. 

The light sources may be either purely diffuse emitters or Phong-style emitters 
whose distribution pattern follows a $\cos^n \theta$ distribution around a direction vector $V$.
The overall intensity, the exponent $n$, and the direction vector $V$ may all be changed.

The system built by Kawai et al. allowed a user to interactively specify constraints 
by selecting objects from a rendered image, and then setting values using a set of 
interactive buttons and sliders. One set of values was derived from subjective criteria
based on the work of Flynn, published in the 1970s. 

Kawai et al. generated a number of different images of a conference room using
different lighting configurations. They asked observers to rate those images with
respect to the subjective feelings of clarity, pleasantness, and privacy. They also
computed three objective measurements called brightness, nonuniformity, and
peripheral lighting. These measurements are based on the brightness $P$ of each surface
in the room. Each measurement takes some subset of the patches in the room,
computes an area-weighted brightness measure, and then normalizes by the total area in
the subset:

Brightness measures the overall energy of the environment by weighting every patch:

$$f_{\text{brightness}} = \frac{\sum_{i \in M} P_i A_i}{\sum_{i \in M} A_i} \tag{20.19}$$

where $M$ is the set of all surfaces in the scene, $P_i$ is the brightness of surface $i$,
and $A_i$ is its area.

Nonuniformity measures the brightness of the walls with respect to the average
surrounding brightness:

$$f_{\text{nonuniformity}} = \frac{\sum_{i \in W} (P_{a,i} - P_i)^2 A_i}{\sum_{i \in W} A_i} \tag{20.20}$$

where $W$ is the set of all the walls in the scene, and $P_{a,i}$ is the average brightness
of all the elements around patch $i$.

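Peripheral brightness measures the difference between the brightness of the horizontal
and vertical elements:

$$f_{\text{peripheral}} = \frac{\sum_{j \in W} P_j A_j}{\sum_{j \in W} A_j} - \frac{\sum_{j \in H} P_j A_j}{\sum_{j \in H} A_j} \tag{20.21}$$

where $H$ is the set of all horizontal elements.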

The remarkable thing about these objective measures (albeit based on the
subjective measurement of brightness) is that they can be correlated to the subjective
impressions of clarity, pleasantness, and privacy. This correlation was found despite 
the fact that the objective measures above don’t include perspective and hiding; that 
is, every surface in the scene is used in the calculation, weighted by its full area. 
This probably worked in their case because the test images were indoor scenes of 
a convex room where most of the surface area of the scene was visible. In more 
complex environments one would probably need to weight the area terms to use the 
area actually present in the final image. 

The amount of clarity, pleasantness, and privacy may be written in terms of the 
objective measures by the relationship 


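$$\begin{bmatrix} f_{\text{clarity}} \\ f_{\text{pleasantness}} \\ f_{\text{privacy}} \end{bmatrix} = \begin{bmatrix} 0.90 & -0.38 & 0.58 \\ 0.78 & -0.53 & 0.24 \\ 0.90 & 0.32 & 0.09 \end{bmatrix} \begin{bmatrix} f_{\text{brightness}} \\ f_{\text{nonuniformity}} \\ f_{\text{peripheral}} \end{bmatrix}$$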
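
As a small illustration, the relationship above is just a 3 by 3 matrix applied to
the three objective measures, each of which is an area-weighted average. The
coefficients below are copied as printed; the helper names are hypothetical.

```python
# Rows: clarity, pleasantness, privacy.
# Columns: brightness, nonuniformity, peripheral.
COEFFS = [
    [0.90, -0.38, 0.58],
    [0.78, -0.53, 0.24],
    [0.90, 0.32, 0.09],
]


def area_weighted(values, areas):
    """Area-weighted mean of per-patch values over one subset of patches."""
    return sum(v * a for v, a in zip(values, areas)) / sum(areas)


def subjective_scores(f_brightness, f_nonuniformity, f_peripheral):
    """Apply the linear relationship to the three objective measures."""
    objective = (f_brightness, f_nonuniformity, f_peripheral)
    names = ("clarity", "pleasantness", "privacy")
    return {name: sum(c * f for c, f in zip(row, objective))
            for name, row in zip(names, COEFFS)}
```

For example, `area_weighted(P, A)` over all patches gives the brightness measure,
and the result of `subjective_scores` can then be compared against the designer's
targets.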


Kawai et al. allow the designer to interactively set weights on these six subjective 
and objective criteria, and also a weight on the overall energy in the room. They 
take these as the design constraints, which they try to satisfy. The system also
includes barrier constraints which must be satisfied to produce a physically sensible
result; for example, reflectivities must lie in the interval [0,1]. Finally, the renderer
imposes its own physical constraints in the form of the conservation of energy in the 
environment. 

From the constraints a set of partial differential equations is generated, and a 
system solver is invoked to walk through the space of images generated by the 
scene parameters to find one that minimizes the error, computed using the designer’s 
weights. 

An example of their results is shown in Figure 20.11 (color plate). In
Figure 20.11(a) the table was constrained to have a small amount of illumination, while
the overall effect had visual clarity. Figure 20.11(b) uses the same constraints except 
that an additional privacy constraint was added; notice how the lights have been 
directed away from the walls and down onto the table. 

Optimization processes in high-dimensional spaces can often produce unexpected 
results. Kawai et al. noted that the system maximized the brightness in the example 
conference room by pointing the lights at the ceiling. This was aesthetically
unacceptable, so they had to manually add a constraint to keep the lights away from the
ceiling. This anecdote demonstrates how difficult it is to design completely automatic 
procedures in situations that involve subjective design criteria; we often find what 
we want by eliminating what we don’t want. 


20.3.3 Device-Directed Rendering

The two methods discussed above actually rerender a scene in order to meet aesthetic 
design goals. A similar approach was taken by Glassner et al. [160] with a system 
tailored to perform gamut mapping. Recall that in Section 3.6 we discussed how the 
range of displayable colors (or gamut) varies considerably between CRTs, film, and 
print media. Suppose that we have created an image that, when viewed on a CRT, 
meets our design criteria, including overall brightness levels to simulate adaptation. 
There’s still the problem of getting the image from the screen onto film or paper 
without ruining the semantic consistency. Glassner et al. noted that if the image is 
rendered so that it fits the display gamut, then no distortion is needed. 

The range of colors of the pixels in an image cannot be predicted; we must actually 
render the scene. For example, suppose there is a very dark blue chair in an indoor 
environment; it is unlikely that the chair will lie in the gamut of most printers. Even 
if the chair isn’t directly visible, it may be indirectly visible in a reflection, say off 
of a tarnished candlestick. The material of the candlestick will influence the color 
of the reflection; the reflected blue chair might lie within gamut because it is darker 
or color-shifted. In general, multiple interreflections among objects in a scene will 
produce colors that are not present in the original objects. In other words, even if 
the original spectra are all within gamut, their combinations may not be. 

We could adjust the image colors on a pixel-by-pixel basis, as in a postproduction
method, but then we risk adjusting some reflections and not others. Suppose the
blue chair is reflected in two candlesticks, one shiny and one tarnished. Suppose
that they also reflect a red chair, where both red reflections are in-gamut, and as a
result of gamut mapping we bring the out-of-gamut bluish pixels on one candlestick
to the point where they look about the same as on the other candlestick. Then the
blue reflections look the same, but the red reflections don’t: we are simultaneously
being told that the candlesticks reflect light equally (the blues are the same), and that 
they don’t (the reds are different). This is a violation of semantic consistency in the 
image: the sense of what the picture represents has changed as a result of adjusting 
the colors. 

Rather than adjust the image, we can change the colors of the objects and lights 
so that the computed image is completely within gamut. Then no postproduction 
would need to be applied to the picture to make it displayable. 

In order to track the combination of object (and light) colors that are combined 
at each pixel, Glassner et al. rendered the image using ray tracing and saved the 
ray trees (we used one sample per pixel for this phase). From the ray tree we can 
build a symbolic expression representing how each color in the scene combined 
[390]. For example, the tree of Figure 20.12(a) corresponds to the expression of 
Figure 20.12(b). 
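
As a rough illustration of this step, the following Python sketch walks a small ray tree and emits a symbolic expression as a string. The `RayNode` layout, the function name `to_expression`, and the weight names `kr` and `ks` are hypothetical choices made for this example; they are not taken from the system of Glassner et al.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class RayNode:
    """One hit in a ray tree: a named color plus weighted child rays."""
    color: str  # symbolic name of the surface (or light) color at this hit
    children: List[Tuple[str, "RayNode"]] = field(default_factory=list)  # (weight name, subtree)


def to_expression(node: RayNode) -> str:
    """Return a symbolic expression for the color sent back along this node's ray."""
    terms = [node.color]
    for weight, child in node.children:
        terms.append(f"{weight}*({to_expression(child)})")
    return " + ".join(terms)


# A candlestick that reflects a chair, which in turn reflects a wall.
tree = RayNode("candle", [("kr", RayNode("chair", [("ks", RayNode("wall"))]))])
expr = to_expression(tree)
print(expr)
```

Because the expression keeps the scene colors as names rather than numbers, it can be reevaluated later with different color values without tracing any rays again.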

When the image has been completely rendered, the symbolic expressions are all 
stored in a text file. Then the actual colors of the scene are used to evaluate the 
expressions and pixel RGB values are calculated. The RGB values are compared 
against the output device gamut, and those pixels that are out of gamut are flagged, 
along with a real number representing the estimated distance to the nearest point on 
the gamut. 
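
The flagging pass might look something like the sketch below. It assumes, purely for illustration, that the device gamut is the unit RGB cube, so that the nearest in-gamut point is found by clamping each channel; a real printer or film gamut has a more complicated shape. The function names are hypothetical.

```python
import math


def nearest_in_gamut(rgb):
    """Clamp each channel to [0, 1]: the nearest point of an assumed unit-cube gamut."""
    return tuple(min(1.0, max(0.0, c)) for c in rgb)


def gamut_distance(rgb):
    """Euclidean distance from rgb to the assumed gamut (0.0 if the color is inside)."""
    return math.dist(rgb, nearest_in_gamut(rgb))


def flag_out_of_gamut(pixels):
    """Return {pixel index: distance} for every pixel that lies outside the gamut."""
    flagged = {}
    for index, rgb in enumerate(pixels):
        distance = gamut_distance(rgb)
        if distance > 0.0:
            flagged[index] = distance
    return flagged


# The pixel values would come from evaluating the stored symbolic expressions.
pixels = [(0.2, 0.4, 0.9), (1.3, 0.5, -0.4)]
flagged = flag_out_of_gamut(pixels)
print(flagged)
```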

The symbolic expressions that generated the out-of-gamut pixels are retrieved 
from the file, and symbolically differentiated with respect to each color. These 
differential equations tell the system how a given change in each scene color will 
affect the resulting pixel color. The result is a matrix of differential changes called 
the Jacobian. It tells us how each RGB component of each pixel will change, given 
a change in the surface properties of objects in the scene. 
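
To make the idea concrete, here is a small Python sketch of symbolic differentiation over expressions built only from sums and products, followed by the assembly of a Jacobian. The tuple-based expression format and all function names are assumptions made for this example, and each pixel is treated as a single channel to keep the code short.

```python
def var(name):
    return ("var", name)


def add(a, b):
    return ("add", a, b)


def mul(a, b):
    return ("mul", a, b)


def evaluate(e, env):
    """Evaluate expression e using the scene values in env."""
    kind = e[0]
    if kind == "const":
        return e[1]
    if kind == "var":
        return env[e[1]]
    a, b = evaluate(e[1], env), evaluate(e[2], env)
    return a + b if kind == "add" else a * b


def diff(e, name):
    """Return the symbolic derivative of e with respect to the scene value `name`."""
    kind = e[0]
    if kind == "const":
        return ("const", 0.0)
    if kind == "var":
        return ("const", 1.0 if e[1] == name else 0.0)
    if kind == "add":
        return add(diff(e[1], name), diff(e[2], name))
    # Product rule: (a*b)' = a'*b + a*b'
    return add(mul(diff(e[1], name), e[2]), mul(e[1], diff(e[2], name)))


def jacobian(pixel_exprs, names, env):
    """J[i][j] = d(pixel i) / d(scene value j), evaluated at env."""
    return [[evaluate(diff(e, n), env) for n in names] for e in pixel_exprs]


# One pixel whose color is candle + kr * chair.
pixel = add(var("candle"), mul(var("kr"), var("chair")))
env = {"candle": 0.1, "chair": 0.8, "kr": 0.5}
J = jacobian([pixel], ["candle", "chair", "kr"], env)
print(J)
```

Each row of `J` belongs to one pixel (or one pixel channel), and each column to one scene value, which is the layout the matrix equation in the text relies on.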

The goal is to make the picture fit the gamut, so for each pixel we found the 
nearest point on the gamut, and we computed the difference between these two 
colors. This becomes the target for that pixel: the difference reveals the desired 
motion of that pixel to bring it into gamut. Recall that the differential equations 
computed at each pixel describe how the pixel moves with respect to changes in the 
scene color; this may be written as a matrix equation that relates scene color changes 
to pixel color changes. Inverting the matrix tells us how to change the scene colors 
to accomplish the desired changes in pixel colors. 

FIGURE 20.12 

(a) A ray tree for a pixel. (b) The corresponding symbolic expression. 

Typically all of the out-of-gamut pixels cannot be brought into gamut at once 
with the same set of scene color changes, so a best approximate change to the scene 
colors is computed and a small step is taken. A new image can then be immediately 
generated by reevaluating the symbolic expressions; the expensive visibility and 
shading calculations performed by the ray-tracing step need not be repeated. Then 
the process repeats. 
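
The loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the gamut is taken to be the interval [0, 1] for each pixel value, and a plain gradient step (the transpose of the Jacobian applied to the targets) stands in for the matrix inversion or least-squares solve that a real system would use. The names `fit_to_gamut`, `pixel_fn`, and `jacobian_fn` are hypothetical.

```python
def clamp01(v):
    return min(1.0, max(0.0, v))


def fit_to_gamut(pixel_fn, jacobian_fn, colors, rate=0.2, max_iters=200, tol=1e-6):
    """Nudge scene colors until every pixel value lies in [0, 1] to within tol."""
    colors = list(colors)
    for _ in range(max_iters):
        pixels = pixel_fn(colors)                    # cheap reevaluation, no ray tracing
        targets = [clamp01(p) - p for p in pixels]   # desired motion of each pixel
        if max(abs(t) for t in targets) <= tol:
            break
        J = jacobian_fn(colors)                      # J[i][j] = d pixel_i / d color_j
        for j in range(len(colors)):
            # Small step along J^T * targets, a descent direction for the squared error.
            colors[j] += rate * sum(J[i][j] * targets[i] for i in range(len(pixels)))
    return colors


# Two pixels that depend linearly on two scene colors; the first starts out of gamut.
def pixel_fn(c):
    return [c[0] + 0.9 * c[1], 0.5 * c[1]]


def jacobian_fn(c):
    return [[1.0, 0.9], [0.0, 0.5]]


fitted = fit_to_gamut(pixel_fn, jacobian_fn, [0.6, 0.8])
print(fitted, pixel_fn(fitted))
```

Both scene colors are reduced a little on every pass, so the out-of-gamut pixel drifts toward the gamut boundary while the in-gamut pixel stays inside it.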

The result is that we move through picture space coming ever closer to the 
gamut. It may appear that the method is doomed to produce a bad image, because it 
is trying to find scene parameters that will match the target image, which was created 
by simple projection to the gamut (and thus has all the artifacts of that technique). 
But this is not the case, because no combination of scene parameters can move us 
out of picture space. So no set of scene parameters will ever reach that projected 
image and its artifacts; the target is simply a goal that keeps us moving toward the 
gamut. 

As in the system by Kawai et al. [242], the designer may need to supply 
additional manual constraints to make sure that the automatic solution is aesthetically 
acceptable. 

One interesting property of this method is that the same original scene will create 
different resulting images for displays with different gamuts: the image destined 
for a printer will appear different than the image created for CRT viewing. Of 
course, because their gamuts are different, all gamut-mapping methods will produce 
different images for different devices, but this method has the advantage that each 
picture maintains semantic consistency. 


20.4 Further Reading 

The paper by Tumblin and Rushmeier [442] gives an excellent overview of the tone 
reproduction problem, and its relationship to the visual system, film, and CRTs. The 
patent by Statt [419] offers another approach to color image transformation, taking 
into account viewer adaptation. Kajiya and Ullner have studied the problem of ideal 
reconstruction on real devices in some detail; their results are reported in a difficult 
but fascinating paper [238]. 


20.5 Exercise 

Exercise 20.1 

We assumed in Section 20.1 that we could build operators $V_s^{-1}$, $V_p^{-1}$, and $V^{-1}$. 
Discuss these operators and what they mean. 




If the man who paints only the tree, or flower, 
or other surface he sees before him were an 
artist, the king of artists would be the 
photographer. It is for the artist to do 
something beyond this; in portrait painting to 
put on canvas something more than the face the 
model wears for that one day; to paint the man, 
in short, as well as his features; in arrangement 
of colours to treat a flower as his key, not as his 
model. 

James Abbott McNeill Whistler 

(“The Gentle Art of Making Enemies,” 1892) 


THE FUTURE 



To conclude the book I’d like to present my opinion of where the field is headed in 
the near and more distant future. This chapter is mostly speculative and contains 
just my opinions, which are not objective, eternal, or universal. 


21.1 Technical Progress 

There are several directions in which the field can move to greatly increase the 
power and utility of rendered images. Certainly as computers grow in speed and 
parallelization, rendering algorithms will benefit. It is a folklore theorem in computer 
graphics that all research images take a few hours, and all production images take 
about 10 minutes. This comes about from the fact that every time we invent a new 
technique to create images more quickly, we compensate by loading the renderer with 
more work (to take advantage of the freed-up time). So in general people will pick a 
level of complexity and accuracy that is consistent with the time they’re willing to wait 
for an image, which is the limiting factor. There are certainly important questions 
to be addressed in terms of exploiting parallelism and analog computation, but I see 
those as generally responsive moves to changes in the hardware, rather than new 
directions that will be taken by image synthesists from their own impetus. 


21.1.1 Physical Optics 

Image synthesis to date has focused on the geometrical optics approximation to light 
transport that we emphasized in this book. Except for some shading models in 
Chapter 15, the theory of physical optics has largely been unexplored for graphics. 
Three notable exceptions are the papers by Bahar and Chakrabarti [25] and Moravec 
[314], and the thesis by Kochevar [248]. All of these approaches have been 
computationally expensive, which is probably why physical optics has not been more 
thoroughly explored for image synthesis. For some applications, it is important that 
we be able to model diffraction and interference. It would be useful to have the 
option to include these effects in a scene, even if at a significant increase in cost. 


21.1.2 Volume Rendering 

The field of the direct rendering of volumes has recently started to achieve more 
prominence within computer graphics. The annual Visualization and Siggraph 
conferences routinely contain a multitude of papers on the topic. Generally these papers 
address practical issues at the border between modeling and rendering: how to 
convert a 3D vector function into some sort of geometrical or material description that 
may be rendered. I have not emphasized these methods in this book because they 
are both very practical (as opposed to theoretical), and developing at a very rapid 
pace. 

The mathematics of volume rendering follow the FRE as described in 
Equation 17.10. In volume rendering the volumetric absorption and scattering terms 
dominate, and the boundary conditions are sometimes omitted altogether. Because 
of the great quantity of data involved, volume rendering algorithms can be very slow; 
a lot of practical work has been directed toward making such techniques efficient. 

For example, volume rendering anti-aliasing methods can combine object-space 
filtering with screen-space filtering to find approximate but rapid filters. The 
“splatting” method due to Westover [476] takes such an approach; the method has been 
extended to hierarchical rendering [258] and texturing [106]. The design of explicit 
filters for 3D volume rendering has requirements similar to those for the 2D filters 
we’ve seen in this book, but some variations as well [74]. We can also explicitly low- 
pass filter the objects before rendering, creating a scene that is inherently anti-aliased 
[466]. 

Important applications of volume rendering are in fields where we physically 
gather 3D data. Examples include geological exploration, simulated and measured 
fluid dynamics, and medical imaging. 

Medicine is a particularly important field, where results from CAT and MRI 
scanners are used to make diagnoses and plan treatment. Computer graphics may be 
used to check the fit of an artificial knee or hip, plan internal surgery, or suggest the 
results of reconstructive surgery. The use of computer graphics to improve medical 
science is, I believe, one of its most significant applications. 

I believe that volume rendering sits at one of the many crossroads between 
rendering and modeling. Volume rendering methods must accommodate enormous 
amounts of data and present it in an intelligible way. I believe that we will find 
a variety of new methods that are appropriate for all rendering coming from this 
subfield. 


21.1.3 Information Theory 

The theory in this book has relied heavily on the ideas and tools of signal processing. 
A complementary field is that of information theory, which examines how well we 
can communicate a specific sequence of symbols from one place to another (or, 
when used for storage, from one time to another). The information-theoretic view 
tries to make sure that any errors introduced into a message can be detected, and 
perhaps even corrected upon receipt. A large number of error-correcting codes 
and transmission protocols have been developed to accomplish this. An excellent 
introduction to the subject may be found in the book by Hamming [184]; a more 
detailed yet still accessible description is provided by Ash [19]. 

When a photon is emitted from a light source and then strikes an object, that 
photon has effected the transfer of some information. Minimally, it represents that 
a certain amount of energy of a specific quantity and quality has been transferred 
from one object to the other. But as we have seen, it also tells us something about the 
relative visibility of the two points, and the amount of impact that light source will 
have on the final image. The use of importance to guide the rendering of a scene can 
be thought of as a first application of information-theoretic ideas to rendering theory. 
To see this, consider that a point on a surface communicates light information to 
the image; this communication takes place along a channel (in our case, the physical 
channel is the air or environment through which the light travels). The importance 
at the point can be imagined to describe the size of this channel; a point with low 
importance sends back only a small amount of information, and thus needs only a 
small channel. The tools of information theory can be applied to this problem to 
design transmission codes that carry this information effectively and compactly. 

This is rather speculative right now, since there has not been much attention paid 
to applying information theory to image synthesis. I think it holds promise, though, 
and may help us design new types of efficient rendering algorithms. 





21.1.4 Beyond Photo-Realism: Subjective Rendering 

Photo-realism has served as a fruitful target for image synthesis research: to 
produce images that look like photographs, we have had to develop the theory and 
techniques described in this book. But now that we understand how to make 
images with computers, we need not be bound to the simulation of everyday reality. 
The history of art is often described in terms of movements or schools, in which 
certain groups of artists have together explored particular ways of representing the 
world. Impressionism, expressionism, and minimalism are some modern examples 
of such movements. Photo-realism is a relatively recent newcomer; in fact, the term 
“photo-realism” was only coined in 1968 [297]. 

Computer graphics is getting very close to producing photo-realistic images for 
some simple scenes. An experimental comparison of a real and synthetic image of 
a very simple environment showed that the match was quite good [301]. As our 
algorithms become more efficient and include ever more subtle phenomena, and our 
models more complex and detailed, the match of our simulations to the real world 
will become better and better. 

Three-dimensional image synthesis should extend itself into these other realms. 
The books by Gombrich [164] and Shlain [403] provide excellent descriptions of art 
as a captured perception of the world, not always a mirror of it. We should think 
of the computer not only as a mirror for reality, but as a window into new realities, 
described by new physical laws and new ways of seeing things. 

The emphasis in image synthesis has been to include ever more detail and precision 
into the image. We can find popular and effective media that have taken exactly the 
opposite direction; comics is one example. The popular art form of comics exploits 
simplicity and abstraction to get its message across [293]. Comics has a rich visual 
language for representing different types of mood and action [135], which in fact 
can be ruined by too much realism [293]. 

Complexity, speed, and accuracy have been the driving forces behind the 
development of new rendering algorithms; it’s time to add expression. 

Part of the creation of a piece of art involves deciding what will be included and 
excluded from the work, and how each entity in the work will be presented, both in 
composition and style. Image synthesis until now has treated everything in a scene as 
equally important and rendered with equal precision. Now that we understand the 
general approach to 3D image making with computers, we should begin to consider 
other styles of rendering and presentation. The accurate presentation of physical 
phenomena will remain an important application of computer graphics, but there 
is an entire world of emotional and spiritual visual communication that has so far 
been largely ignored. 

Subjective rendering is the name I use for techniques that allow a designer to create 
a 3D image that includes not only accuracy and simulation, but also a meaning 
imposed by the designer that influences how objects are treated and the image is 
rendered. I believe this subject is a rich and rewarding area for future image synthesis 
research. 


21.2 Other Directions 

In addition to the medical and subjective rendering methods mentioned above, there 
are many directions in which image synthesis can lead, and many places from which 
it can draw inspiration. In this section I will describe some of the sources and uses 
that I foresee; there are certainly many others. 

With increasingly powerful and inexpensive technology, the portable 
encyclopedia may make the transition from science fiction to everyday tool. Devices containing 
complex interactive 3D databases could revolutionize architecture books, allowing 
us to walk around a building, walk inside at different times of day and see where the 
shadows fall, and watch the sun rise and set at different times of the year. Engineering 
and mathematics texts could greatly benefit from animated presentations of dynamic 
systems. Biology, geology, and the other natural sciences could contain animated 
developments of complex ideas or time-evolving phenomena. The arts can benefit as 
well, offering the student captured performances, demonstrations, and instruction 
in technique, and even simply recording significant examples of artwork from 
museums and collections of all sorts. Many topics currently illustrated in dictionaries 
and encyclopedias with drawings and photographs could instead use precomputed 
animations or interactive simulations. The sheer glitz of such possibilities is likely to 
overwhelm their utility in the beginning, leading to useless or poorly produced 
documents, but as they become more common, the standards for quality and information 
content will rise. 

In this book we have considered 3D surfaces in a world governed by classical 
physics. There’s no reason to be restricted to either of these worlds. 

The idea of a “fourth dimension” has captured people’s imagination since H. G. 
Wells used it as the basis for his classic story on time travel, The Time Machine, 
in 1895 [474]. The idea of a fourth dimension was given a more scientific basis 
in the development of the theory of relativity [388]. It is still a fascinating idea to 
think about, as thoroughly explored by Rucker in his book on the subject [364]. To 
look into four (or more) dimensions requires a feat of mental visual skill that most 
people don’t seem to have. But we can interpret 3D projections of these objects. In 
Abbott’s classic book Flatland [1], residents of a 2D world interpret 3D objects by 
the projection of those objects onto the plane in which they live. Burger carried this 
one step further in Sphereland [70], where 3D creatures like us interpret 4D objects. 

Techniques for actually making these projections in a way that is useful need to 
be developed. There has been some recent progress in this direction. The book 
by Banchoff [28] offers an excellent introduction to the subject and a number of 
methods for looking at 4D objects. More recently, Hanson has developed 
direct-viewing methods for 4D objects using a careful combination of illumination and 
transparency [193,194]. 

We know from relativity that light deviates from a straight path in the presence 
of mass; in fact, it follows a geodesic in 4D space-time [306]. Viewed from our 3D 
space, the light appears to follow a curved path. We would expect this to affect our 
visual world, in the same way that mirages redirect light from one path to another. 
Indeed this is the case, and this visual distortion can be modeled. The work of Hsiung 
and Dunn [216] and Yamashita [495] presented methods for this visualization, which 
can be extended to include the color shift predicted by the physics [217]. 

From the early days of television, people have predicted a form of real-time 3D 
video, rather than the flat screen used today. Usually people imagine a “free-viewing” 
display, meaning that any number of people can see the image at the same time, all 
free of any additional viewing technology (such as special glasses). 

Such a display continues to elude us, but a variety of alternatives are variously 
available, plausible, or simply interesting. 

One prominent approach that is currently available is the family of stereo 
display technologies. Such systems almost always generate two images, and either 
directly or indirectly route one image to each eye. As with the random-dot stereograms 
of Chapter 1, the human visual system manages to combine these two independent 
views into a single whole. On a very different front, we can use the techniques of 
stereolithography or computer-directed machining to actually take a 3D 
mathematical surface description and turn it into a real 3D object. There are limitations to the 
types of shapes such machines can make, but geometers have long known that when 
you have a real object in your hand you can use your tactile and motor-memory skills 
to augment your visual perception of the shape to improve your understanding of 
it. Computer-generated holograms are becoming easier to make [179], and a variety 
of other 3D display technologies are discussed in McAllister’s book on the subject 
[290]. 

Certainly there are many wild possibilities for display systems if we are willing 
to consider new technologies. The glow of a firefly can be recreated with manmade 
chemicals, which can be purchased prepackaged in plastic tubes in automotive supply 
stores. Inside the tube is a liquid (sold by Kodak under the name Luminol) within 
which is a smaller glass vial containing a second liquid. Bending the glass tube 
breaks the internal vial, and then shaking the tube mixes the two liquids. The result 
is a greenish yellow phosphorescent glow that can last for several hours and be 
used as an emergency light or warning beacon. Similar compounds can be bought 
at many public gatherings premixed in thin, flexible tubes that may be used as 
bracelets. Different choices for the activating chemical produce different colors of 
light, including red, green, and blue. 

Imagine a system of airbrushes, where each airbrush is under computer control 
to specify the atomized liquid it projects, as in Figure 21.1. The computer controls 
not only the direction of the spray but also the amount of liquid sprayed and its 
spread. Suppose there were several of these atomizers spaced around the inside of 
an enclosure, each spraying a mist of one chemical. Several other atomizers are 
distributed in the enclosure, and each one of these emits a spray that upon mixing 
with the other chemical causes a glowing red, green, or blue cloud. By careful color 
mixing we can achieve any combination of these colors, just as on a CRT. There are 
some practical issues to be addressed in such a system, but I particularly like the idea 
of glowing clouds of light. 

FIGURE 21.1 

A schematic view of the color cloud chamber. 

There are many exciting applications of synthetic images. As I mentioned before, 
I believe that some of the most important are in medicine. In addition to the planning 
and study uses mentioned earlier, synthetic imagery could be used to help those who 
have poor or no eyesight. Depending on where the problem is located, we can 
imagine a day when we understand the signal that is conveyed along the optic fiber 
from the eyes to the brain well enough to predict the signal for a given visual stimulus. 
Then we could take the visual stimulus, either real or synthetic, convert it into this 
form, and then directly apply it to the optic fiber. We could in theory work our 
way ever higher up the visual pathway until we were able to provide direct cortical 
stimulation to induce the sensation of “seeing” something. 

This chain of thought leads to the uses of computer graphics in “virtual reality.” 
Currently two approaches dominate: in one, a user wears a helmet that blocks out all 
visual (and perhaps aural) stimulation except that provided by the computer; in the 
other, a partial helmet superimposes computer-generated imagery between the eye and 
the world. 

The former application may find value for completely imaginary worlds or worlds 
at a great distance: for example, navigating through a computer file hierarchy or 
driving a Mars rover from Earth. In a fully controlled environment we can experience 
synesthesia, where senses appear to be cross-coupled, allowing us to feel yellow, taste 
loudness, or look at the sensation of sour. 

The superimposed display can provide some externalization of memory, much 
as writing extended memory. The computer can provide a symbolically augmented 
world, which enhances our understanding of the natural objects around us. We can 
simply look around a room and, with the appropriate software, determine how long 
the lights have been on, or which items in the room were made in a given country, 
or which were gifts from a particular person. More complex relationships can be 
established and demonstrated explicitly or symbolically: for example, we could draw 
connecting lines between all objects that contain information regarding a particular 
project, or simply highlight them in the visual world. 

I believe there is a danger in simply following the technology without considering 
carefully what impact it has on the person using it and society as a whole. The 
complete simulation of the natural world may suggest that our bodies can somehow 
be left behind, and simply maintained at some minimal level while our brains are 
stimulated by computer-generated information. I believe that a human being is a 
tightly interconnected system of mind and body, and neither can be ignored 
without affecting the other. Simulated sensory perception does not replace the direct 
experience of external reality, metaphysical precision aside. I believe that reducing 
the importance of our bodies on our experience (or simply paying less attention to 
them) may take us farther from the important questions of life that have occupied 
humankind throughout history. I believe that we need both mind and body working 
together to experience the totality of the human experience, something that cannot 
be captured by either a remote-controlled robot or a disembodied computer 
program. Human beings are a combined mind-body system, and I believe both sides of 
our natures must be honored. 


21.3 Summary 

I’m happy to say that rendering systems are becoming more of a commodity item. 
People are now treating rendering systems like calculators: complex systems that can 
be guided with a few controls to accomplish a task reliably and predictably. We’re 
not yet at the same level of ease that is provided by the calculator, but we’re getting 
there. 

I hope image synthesis can be a positive force in the world. I hope people find 
this another medium for creative self-expression, for sharing ideas and thoughts, for 
helping bring about more communication between people, and for growing, teaching, 
learning, and experiencing joy in the remarkable, vibrant universe in which we live. 





APPENDICES 


Abbreviators do harm to knowledge and to love, 
seeing that the love of any thing is the offspring of 
this knowledge, the love being the more fervent in 
proportion as the knowledge is more certain. And 
this certainty is born of a complete knowledge of 
all the parts, which, when combined, compose the 
totality of the thing which ought to be loved. 


Leonardo da Vinci 





In order to deviate successfully, one has to have 
at least a passing acquaintance with whatever 
norm one expects to deviate from. 

Frank Zappa 

(“The Real Frank Zappa Book,” 1989) 



LINEAR ALGEBRA 


A.1 General Notation 

A number of typographical conventions are used in this book to simplify formulas 
and identify algebraic objects and useful phrases. Table A.1 gives a summary. 


A.2 Linear Spaces 

A vector space, or linear space, is a combination of a set $X$ and two operators, 
commonly called addition and scalar multiplication. Addition of two elements 
$x, y \in X$ is written $x + y$, and scalar multiplication by some (possibly complex) 
factor $a \in \mathcal{C}$ is written $ax$. A vector space is algebraically closed under these 
operators; that is, $(x + y) \in X$ and $ax \in X$ for all $x, y \in X$ and $a \in \mathcal{C}$. The 
dimension of the space is the largest number of linearly independent elements. 

Notation                 Meaning 
$\mathcal{Z}$            The integers 
$\mathcal{C}$            The complex numbers 
$\mathcal{R}$            The real numbers 
$(a, b)$                 The interval from $a$ to $b$, excluding both 
$[a, b]$                 The interval from $a$ to $b$, including both 
r                        An interval of reals 
$\forall$                For all 
$\exists$                There exists 
$\Leftrightarrow$        If and only if 
iff                      If and only if 

TABLE A.1 

Notation for algebraic objects and useful phrases. 

A particularly useful linear space is the space of linear functions. Consider the 
set of all functions $A: \mathcal{R} \to \mathcal{R}$. Define addition and scalar multiplication pointwise, 
so that for two functions $f, g \in A$, we write $(f + g)(x) = f(x) + g(x)$ and 
$(af)(x) = a f(x)$. If two functions $f$ and $g$ satisfy these equations, they satisfy the requirements 
for a linear space. In this appendix we will speak variously of vectors or functions 
as elements of a linear space, depending on context. 
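
The pointwise definitions can be tried out directly. The following Python sketch treats ordinary callables as elements of the space; the names `add` and `scale` are chosen for this example only.

```python
import math
from typing import Callable

Fn = Callable[[float], float]


def add(f: Fn, g: Fn) -> Fn:
    """Pointwise sum: (f + g)(x) = f(x) + g(x)."""
    return lambda x: f(x) + g(x)


def scale(a: float, f: Fn) -> Fn:
    """Pointwise scalar multiple: (a f)(x) = a * f(x)."""
    return lambda x: a * f(x)


# h(x) = sin(x) + 2 cos(x) is again a function from the reals to the reals.
h = add(math.sin, scale(2.0, math.cos))
print(h(0.0))
```

The result of either operation is another function of the same kind, which is the closure property a linear space requires.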


A.2.1 Norms 

A norm is generally intended to measure the size of some object. A norm for a 
function $x(t)$, written $\|x\|$, is a function to the reals that satisfies four requirements: 

$$
\begin{aligned}
\|x\| &\ge 0 \\
\|x\| = 0 &\Leftrightarrow x = 0 \\
\|ax\| &= |a|\,\|x\| \\
\|x + y\| &\le \|x\| + \|y\|
\end{aligned}
\tag{A.1}
$$


The $L_p$ norms are a family of norms that can be defined for functions. The $L_p$ 
norm for a function $x(t)$ is written $\|x\|_p$; it is given by 

$$
\|x\|_p = \left( \int_a^b |x(t)|^p \, dt \right)^{1/p}
\tag{A.2}
$$


The $L_1$, $L_2$, and $L_\infty$ norms are the most common, and the only ones used in this 
book. The $L_2$ norm is often called the root-mean-square (or RMS) norm, and the $L_\infty$ 
norm is called the Tchebyshev norm. They are given by plugging in the appropriate 
value for $p$ into Equation A.2, giving 

$$
\begin{aligned}
\|x\|_1 &= \int_a^b |x(t)| \, dt \\
\|x\|_2 &= \left( \int_a^b |x(t)|^2 \, dt \right)^{1/2} \\
\|x\|_\infty &= \lim_{p \to \infty} \|x\|_p = \max_{a \le t \le b} |x(t)|
\end{aligned}
\tag{A.3}
$$




A linear space together with a norm is called a normed linear space. 
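
As a numerical illustration, the sketch below approximates these norms for a function on an interval $[a, b]$ using a midpoint Riemann sum and dense sampling. The function names and the sample count are assumptions of this example, and the results are approximations rather than exact values.

```python
import math


def lp_norm(x, a, b, p, n=10_000):
    """Approximate the L_p norm of x on [a, b] with a midpoint Riemann sum."""
    h = (b - a) / n
    total = sum(abs(x(a + (k + 0.5) * h)) ** p for k in range(n)) * h
    return total ** (1.0 / p)


def linf_norm(x, a, b, n=10_000):
    """Approximate the L_infinity norm as the largest sampled |x(t)|."""
    return max(abs(x(a + k * (b - a) / n)) for k in range(n + 1))


def ramp(t):
    return t


l1 = lp_norm(ramp, 0.0, 1.0, 1)     # exact value is 1/2
l2 = lp_norm(ramp, 0.0, 1.0, 2)     # exact value is 1/sqrt(3)
linf = linf_norm(ramp, 0.0, 1.0)    # exact value is 1
print(l1, l2, linf)
```

Evaluating `lp_norm` with larger and larger `p` gives values that creep up toward `linf`, in line with the limit in the definition of the $L_\infty$ norm.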


A.2.2 Inf and Sup 

If $S$ is a set or sequence (finite or infinite) of real numbers, then $\inf S$ is the infimum, 
or greatest lower bound, of $S$: the largest real number $r$ such that $r \le s$ for all $s \in S$ 
(or $-\infty$ if $S$ is unbounded below). Similarly, $\sup S$ is the supremum, or least upper 
bound: the smallest $r$ such that $r \ge s$ for all $s \in S$. 

A related but different pair of terms is lim inf and lim sup. These apply only to 
sequences, not to sets. Suppose $S$ is a sequence of elements $s_n$: $\{s_1, s_2, \ldots\}$. Then 
$\liminf S$ is the limit of $\inf \{s_n, s_{n+1}, \ldots\}$ as $n \to \infty$; loosely, the largest real number $r$ such that 
$s_n \ge r$ for all sufficiently large $n$. Similarly, $\limsup S$ is, loosely, the smallest real number $r$ 
such that $r \ge s_n$ for all sufficiently large $n$. 

If the sequence $S$ has a limit, the lim sup and lim inf are both equal to the limit. 
Every sequence has a lim sup and a lim inf whether it has a limit or not. For example, 
the sequence $\{1, -1, 1, -1, \ldots\}$ has $\liminf S = -1$ and $\limsup S = +1$; the sequence 
$\{1, 1/2, 1/3, 1/4, \ldots\}$ converges to 0, so $\liminf S = \limsup S = 0$. 

If $S$ is a finite set of numbers, then $\inf S = \min(S)$, and $\sup S = \max(S)$. If 
$S = \mathcal{Z}$ (the set of all integers), then $\inf S = -\infty$, and $\sup S = +\infty$. If $S$ is empty, 
then by convention $\inf S$ is taken to be $+\infty$. 
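
A finite computation cannot produce a true lim inf or lim sup, but the inf and sup of a late tail of a sequence give a feel for them. The Python sketch below does this for an alternating sequence and for the sequence 1, 1/2, 1/3, and so on; the helper name `tail_inf_sup` and the cutoff indices are arbitrary choices for this example.

```python
def tail_inf_sup(seq, start):
    """Return (inf, sup) of the tail seq[start:], a finite stand-in for lim inf and lim sup."""
    tail = seq[start:]
    return min(tail), max(tail)


alternating = [(-1) ** n for n in range(1000)]    # 1, -1, 1, -1, ...
harmonic = [1.0 / n for n in range(1, 1001)]      # 1, 1/2, 1/3, ...

alt_bounds = tail_inf_sup(alternating, 500)
har_bounds = tail_inf_sup(harmonic, 900)
print(alt_bounds, har_bounds)
```

For the alternating sequence the tail bounds stay at -1 and +1 no matter how late the tail starts, while for the second sequence both bounds shrink toward 0 as the starting index grows.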


A.2.3 Metrics 

A metric is considered some measure of the distance between two objects. A metric 
$d(x, y)$ between two objects $x$ and $y$ must satisfy four requirements: 

$$
\begin{aligned}
d(x, y) &\ge 0 \\
d(x, y) = 0 &\Leftrightarrow x = y \\
d(x, y) &= d(y, x) \\
d(x, z) &\le d(x, y) + d(y, z)
\end{aligned}
\tag{A.4}
$$

where the last requirement holds for any third object $z$. The metric may be defined in terms of some norm: 

$$
d(x, y) = \|x - y\| \tag{A.5}
$$
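
As a quick check of these requirements, the sketch below builds a metric from the ordinary Euclidean norm on 2D points and tests the four properties on a handful of sample points. The choice of norm, the sample points, and the small tolerance in the triangle inequality (to absorb floating-point rounding) are all assumptions of this example.

```python
import math
from itertools import product


def norm2(v):
    """Euclidean norm of a vector given as a sequence of numbers."""
    return math.sqrt(sum(c * c for c in v))


def metric(x, y):
    """d(x, y) = ||x - y||, the metric induced by the Euclidean norm."""
    return norm2([a - b for a, b in zip(x, y)])


points = [(0.0, 0.0), (3.0, 4.0), (-1.0, 2.0)]
for x, y, z in product(points, repeat=3):
    assert metric(x, y) >= 0.0                                  # non-negativity
    assert (metric(x, y) == 0.0) == (x == y)                    # zero only for identical points
    assert metric(x, y) == metric(y, x)                         # symmetry
    assert metric(x, z) <= metric(x, y) + metric(y, z) + 1e-12  # triangle inequality

print(metric((0.0, 0.0), (3.0, 4.0)))
```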


A.2.4 Completeness 

Suppose we have some sequence of elements in a space $X$: $(x_1, x_2, x_3, \ldots)$. We 
can imagine using a norm in that space to find their distance from some element $x$. 
Suppose that in the limit the elements of this sequence get ever closer to $x$: 

$$
\lim_{n \to \infty} \|x_n - x\| = 0 \tag{A.6}
$$

Then we say that the sequence converges to x. A vector space X is complete if the 
limit element for every convergent sequence is also in the space. A complete vector 
space is called a Banach space. 

We call a sequence a Cauchy sequence if

$$ \sup_{m,n > N} \|x_n - x_m\| \to 0 \quad \text{for } N \to \infty \tag{A.7} $$

which says that as we go further into the sequence, the terms are closer and closer
together.

All of these notions of convergence and completeness are based on the norm being
used at the time. A sequence in an infinite-dimensional linear space that is convergent
under one norm may not be convergent under another. However, finite-dimensional
linear spaces possess two very useful properties:

1 If a sequence in such a space is convergent under some norm, it is convergent 
under all norms. 

2 Every normed finite-dimensional space is complete. 

These properties are of great utility, because although computer graphics deals
with problems posed in infinite-dimensional spaces of continuous functions, when
represented on a computer all of our algorithms necessarily deal with
finite-dimensional spaces of discrete functions.


A.8.5 Inner Products 

The inner product (or dot product) tells us something about how one element projects
onto another. In this book, we take the inner product of two vectors with the bra-ket
notation, combining a bra $\langle f|$ with a ket $|g\rangle$ to form the inner product, $\langle f|g\rangle$.
An inner product must satisfy four requirements:

$$
\begin{aligned}
\langle x|y\rangle &= \langle y|x\rangle \\
\langle a x_1 + b x_2|y\rangle &= a\langle x_1|y\rangle + b\langle x_2|y\rangle \\
\langle x|x\rangle &\ge 0 \\
\langle x|x\rangle &= 0 \iff x = 0
\end{aligned}
\tag{A.8}
$$

The bra-ket lets us take the inner product of two continuous functions $f$ and $g$:

$$ \langle f|g\rangle = \int f(t)\, g(t)\, dt \tag{A.9} $$

This definition can be easily shown to satisfy the four requirements listed above. 
When no domain of integration is explicitly listed, the domain of the first function 
is implied. 

It is sometimes useful to generalize this definition with a weighting function w(t) 
that gives different impact to different pieces of the two functions involved. So the 
full inner product would be written 

$$ \langle f|g\rangle = \int f(t)\, g(t)\, w(t)\, dt \tag{A.10} $$

The form we use most often in this book sets w(t) = 1. 

If we have an inner product for a space, we can always define a norm by

$$ \|x\| = \sqrt{\langle x|x\rangle} \tag{A.11} $$

We can show that this definition satisfies the requirements of a norm almost
immediately from the definition of the inner product.
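
The inner product of two functions and the norm derived from it can be approximated numerically. The following is a minimal sketch, assuming real-valued functions on a finite interval and using the midpoint rule for the integral; the names `inner_product` and `norm`, the interval arguments `a` and `b`, and the step count are choices made for this illustration rather than anything defined in the text.

```python
import math

def inner_product(f, g, a, b, w=lambda t: 1.0, steps=10_000):
    """Approximate <f|g> = integral from a to b of f(t) g(t) w(t) dt (midpoint rule)."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        t = a + (i + 0.5) * h          # midpoint of the i-th subinterval
        total += f(t) * g(t) * w(t)
    return total * h

def norm(f, a, b):
    """Norm derived from the inner product: sqrt(<f|f>)."""
    return math.sqrt(inner_product(f, f, a, b))

# sin and cos are orthogonal over one full period, and <sin|sin> = pi there.
sin_cos = inner_product(math.sin, math.cos, 0.0, 2.0 * math.pi)
sin_norm = norm(math.sin, 0.0, 2.0 * math.pi)
```

Passing a different `w` gives the weighted form of the inner product; the default weight of 1 matches the form used most often in this book.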

Not all norms can be written as a function of the inner product in some space. If 
we indeed have a Banach space with a norm derived from an inner product, we call 
that a Hilbert space . Familiar Euclidean rectilinear space is a Hilbert space. 

This cumulative addition of structure to derive a series of spaces is summarized 
in Table A.2. 

The inner product gives rise to the idea of perpendicular or orthogonal elements.
We say two elements $x, y \in X$ are orthogonal, and write $x \perp y$, if and only if
$\langle x|y\rangle = 0$. By extension, for some set $S$, $x \perp S$ if $\langle x|s\rangle = 0$, $\forall s \in S$.

A useful result of this property is that we can represent one element in terms of its
projections onto other elements. In Euclidean space this is represented by projecting
a vector onto the three principal axes, and then adding scaled versions of those
axes back together again to retrieve the original vector. In general, we can write an
element $x$ in terms of $n$ other suitable elements $y_i$ by

$$ x = \sum_{i=1}^{n} \langle x|\tilde{y}_i\rangle\, y_i \tag{A.12} $$

where the set of functions $\{\tilde{y}_i\}$ are the algebraic duals of the set of bases $\{y_i\}$.

Cumulative components                                          Space name
-------------------------------------------------------------  ----------------------------------------
Set $S$
Addition operator: $x + y \in X \;\; \forall x, y \in X$
Scalar multiplication operator: $ax \in X \;\; \forall a \in C, x \in X$   Linear space
Complete linear space with norm $\|x\|$                           Banach space
                                                               (if not complete, a normed linear space)
Inner product $\langle f|g\rangle$ with derived norm                    Hilbert space

TABLE A.2

The hierarchy of linear spaces.


A.3 Function Spaces 


An important class of linear spaces are those occupied by functions. 

Classes of functions that satisfy a particular criterion are said to make up a
function space. The most important function space in this book is called the function
space $\mathcal{L}^2$; it is made up of all functions $x(t)$ that satisfy

$$ \int_{-\infty}^{+\infty} |x(t)|^2\, dt < \infty \tag{A.13} $$

Such a function $x(t)$ is called square-integrable. A smaller space is denoted $\mathcal{L}^2(a, b)$,
which is the class of functions that are square-integrable on the interval $(a, b)$; that
is,

$$ \int_a^b |x(t)|^2\, dt < \infty \tag{A.14} $$


Two functions $x(t)$ and $y(t)$ in $\mathcal{L}^2$ that are equal for all but a finite number of
values of $t$ are said to be equivalent, or equal for almost all values of $t$. So if two
functions $x(t), y(t) \in \mathcal{L}^2$ are equivalent, they satisfy

$$ \int |x(t) - y(t)|^2\, dt = 0 \tag{A.15} $$

A related set of spaces is $\mathcal{L}^1$ and $\mathcal{L}^1(a, b)$, which satisfy

$$ \int |x(t)|\, dt < \infty \tag{A.16} $$

for domains $[-\infty, +\infty]$ and $(a, b)$, respectively.


A.4 Further Reading 

This appendix barely hints at the rich structure of linear spaces. This is a field of 
study that has reached great depths and reveals elegant abstract relationships; it 
rewards careful study. Detailed introductions to this material may be found in most 
linear algebra texts, such as Strang’s book [421]. Good starting points up to and 
including Hilbert spaces include the books by Young [500] and Berberian [39]. Some 
discussions intended to set the stage for particular topics are available, for example, 
integral equations in the books by Hoheisel [213] and Delves and Mohamed [120], 
and real and functional analysis in the books by Royden [362] and Rudin [365]. 

A good discussion of states, probabilities, and the linearity of state space in terms 
of quantum mechanics is given by Sudbery in his book [427]. 




Each topo map is a mosaic of errors, such as
one section drawn 10% too large, while the
adjacent one, in compensation, is drawn 10%
too small. So, over distances of one to several
miles, my figures can be up to 10% off.
However, over longer distances, the map’s
errors balance out, and so do my mileages.

Jeffrey P. Schaffer 
(“Hiking the Big Sur Country: 

The Ventana Wilderness,” 1988) 



PROBABILITY 


A full exposition of statistics and probability is beyond the scope of this book. The 
reader who has not seen this information before or wants more depth should consult 
one of the many excellent texts on the subject, listed in the Further Reading section. 
Our survey will follow the work of Papoulis [331], Spanier and Gelbard [415], and 
Hammersley and Handscomb [183]. We start with events and then discuss total 
probability, repeated trials, random variables, measures, and distributions. 


B.1 Events and Probability

In this section we will define some basic terms from probability theory and statistics. 
There are many subtleties and variations in the definitions that we will not explore, 
in favor of a rapid outline of the most relevant ideas. Much more detail on all of 
these topics may be found in the references. 

In probability we often speak of an experiment. This is meant to stand for any
process that returns a result. Thus an experiment can involve tossing a coin, or
running a computer program to determine the first object intersected by a ray. Typically
experiments are nondeterministic; that is, they will return one of many (perhaps
infinitely many) answers, where each answer has a particular likelihood of occurring. When
discussing outcomes of experiments, we usually group them into sets and then use
common set operations such as union, intersection, and difference to compare sets
and build new ones.

The canonical example of elementary probability is the die-throwing experiment. 
We suppose we have a fair die : a cube with the six integers from 1 to 6 painted on 
its faces, one number per face. The result of an experiment is the number showing 
on the top of the die after it has been thrown. Since it’s a perfect cube, if we throw 
the die after sufficient shaking, no face is any more likely to end up on top than any 
other. When we refer to the “die-throwing experiment” below, we mean one fair 
throw of one fair die, so that each of the six faces has an equal likelihood of ending 
up on top. 

Another experiment we will use involves bird spotting. Suppose we take a walk 
in a forest, and we classify the various birds we see in terms of their age and size. 
We will consider each walk in the woods one experiment, whose outcome may be a 
description of several birds, only one, or none at all. 

We will call each possible set of results of an experiment an event, and we will
use italic capital letters ($A$, $B$, $C$, or more generally, $A_i$) to represent events in this
appendix. The term “event” should be distinguished from “result”; a single event
may include many results. For example, we might speak of “a 9-inch, 3-year-old 
bird” as an event, or “any 6-month old bird” as an event. An event includes any 
number of possible results of the experiment: none, one, or many. If the outcome 
of an experiment matches any element in an event, we say that event has occurred . 
Thus one experiment has only one result, but it could satisfy several events. 

We use $\Omega$ to denote the set of all possible outcomes of the experiment. Typically
$\Omega$ is called the certain event, meaning that because it contains all possible results,
$\Omega$ is certain to contain the result of any run of the experiment. The set of
no results is the empty set $\emptyset$.

Suppose that in our bird-watching experiment, we have the following events 
containing (age, size) pairs (the units are arbitrary; years and inches are an example). 
These events are sets of possible experimental outcomes: 

$$
\begin{aligned}
A_1 &= \{(2,6), (3,1), (1,4)\} \\
A_2 &= \{(1,5), (3,1)\} \\
A_3 &= \{(2,2), (4,3)\}
\end{aligned}
\tag{B.1}
$$

The two sets $A_2$ and $A_3$ are said to be mutually exclusive, since they share no
common elements. Sets $A_1$ and $A_2$ are not exclusive. Formally, a list of events is
exclusive if for any $i \ne j$, $A_i \cap A_j = \emptyset$.

Turning now to the die-throwing experiment, we might have the following three
sets of possible rolls:

$$
\begin{aligned}
A_1 &= \{1, 2, 4, 5\} \\
A_2 &= \{3\} \\
A_3 &= \{1, 3, 6\}
\end{aligned}
\tag{B.2}
$$

These sets are not exclusive as a list, since $A_3$ shares an element with each of $A_1$
and $A_2$. However, taken together they
represent all the possible outcomes of the experiment. We call such a collection of
events exhaustive. Formally, a set of events $A_i$ is exhaustive if $\bigcup_i A_i = \Omega$.

A particular set of events may be exclusive, exhaustive, neither, or both. 

A probability is a quantitative measure expressing our expectation that a given
event will occur. Probability is a number between 0 and 1 inclusive; a value of 0
means that in practice that event will never occur, while a value of 1 means that it
will certainly occur. We write the probability of event $A$ as $P(A)$. The probability is
defined as a real function that satisfies

$$
\begin{aligned}
P(\emptyset) &= 0 \\
P(\Omega) &= 1 \\
0 \le P(A) &\le 1 \quad \text{for all } A
\end{aligned}
\tag{B.3}
$$

Several combinations of events will prove useful. The probability of two events 
A and B occurring is written P(AB). The probability of one of two events A or 
B occurring is written P(A + B). Finally, we can write the conditional probability 
of A given B, P(A|B), which is the probability that A will occur given that B has 
occurred. 

Probabilities combine according to two basic laws:

$$
\begin{aligned}
P(A + B + C + \cdots) &\le P(A) + P(B) + P(C) + \cdots \\
P(AB) &= P(A|B)\,P(B)
\end{aligned}
\tag{B.4}
$$

If the events are exclusive, then the inequality in the first equation becomes exact
equality.

Two events are called independent if

$$ P(AB) = P(A)\,P(B) \tag{B.5} $$

If events $A$ and $B$ are independent, then $P(A|B) = P(A)$.
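
These definitions can be checked mechanically for the die-throwing experiment. The following is a sketch under the assumption that events are represented as Python sets of faces and that every face is equally likely; the function names `prob` and `conditional`, and the example events `even` and `low`, are introduced here only for illustration.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))           # the certain event for one fair die

def prob(event):
    """P(event) for equally likely faces."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def conditional(a, b):
    """P(A|B) = P(AB) / P(B), from the second line of Equation B.4."""
    return prob(a & b) / prob(b)

# The three sets of rolls from Equation B.2.
A1, A2, A3 = frozenset({1, 2, 4, 5}), frozenset({3}), frozenset({1, 3, 6})
exhaustive = (A1 | A2 | A3) == OMEGA
exclusive = all(not (x & y) for x, y in [(A1, A2), (A1, A3), (A2, A3)])

# Two events that happen to be independent: "even face" and "face is 1 or 2".
even, low = frozenset({2, 4, 6}), frozenset({1, 2})
independent = prob(even & low) == prob(even) * prob(low)
```

Because `A1` and `A3` overlap, the sum of their probabilities exceeds the probability of their union, which is the strict form of the inequality in Equation B.4.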


B.2 Total Probability 

Suppose that we have $n$ exclusive and exhaustive events $A_i$. For example, we might
have an experiment that returns points in the unit circle. If we subdivide the circle
into $n$ wedges $W_i$, as in Figure B.1, then each $A_i$ can correspond to wedge $W_i$.
For any arbitrary event $B$ where $P(B) \ne 0$, we can write the theorem on total
probability:

$$ P(B) = P(B|A_1)P(A_1) + \cdots + P(B|A_n)P(A_n) \tag{B.6} $$

FIGURE B.1

The different wedges $W_i$ tile the circle. Another event $B$ is somewhere in the circle.

In words, this theorem says that the probability of an event $B$ occurring is equal
to the sum of $n$ different products, one for each of the events $A_i$. If the $A_i$ are
exhaustive, then we know that in a given experiment at least one of them must occur.
If they are also exclusive, then we know that one and only one of them will occur.
So $1 = P(A_1) + \cdots + P(A_n)$. The first term on
the right-hand side of the theorem is the probability of $B$ occurring given that $A_1$ has
occurred, times the probability that $A_1$ actually did occur. For any given experiment,
only one of the $A_i$ will be satisfied; suppose it’s $A_q$. Then for that experiment (since
we already know the outcome), $P(A_i) = 0$ for $i \ne q$, and $P(A_q) = 1$. Thus the
right-hand side reduces to $P(B|A_q)P(A_q) = P(B|A_q) = P(B)$. The last simplification
comes about because we know that $A_q$ actually did occur. The theorem is important
because it allows us to express the probability that an event will happen in terms of
conditional probabilities for that event and probabilities of other events; often we
know this information.

To prove the theorem, think of the different events as exactly corresponding to
their regions; thus, the event $A_2 + A_3$ corresponds to the region $W_2 \cup W_3$, and $A_2 A_3$
corresponds to $W_2 \cap W_3$. Then because the $A_i$ are exhaustive, their union is the
complete circle, and the intersection of that with $B$ is simply $B$:

$$ B = B(A_1 + \cdots + A_n) = BA_1 + \cdots + BA_n \tag{B.7} $$

Because the $A_i$ are exclusive, the regions $BA_i$ don’t overlap, and from the first
line of Equation B.4 we can say

$$ P(B) = P(BA_1) + \cdots + P(BA_n) \tag{B.8} $$

Equation B.4 also says that $P(BA_i) = P(B|A_i)P(A_i)$, and combining this with
Equation B.8 gives us Equation B.6. The theorem on total probability is useful for
finding the $P(B)$ in terms of the $P(B|A_i)$ and the $P(A_i)$.
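
The theorem can be verified on a small example. This sketch swaps the wedges of the circle for a partition of the six die faces, which is an assumption made purely to keep the arithmetic exact; the partition, the event `B`, and the helper names are all illustrative.

```python
from fractions import Fraction

OMEGA = frozenset(range(1, 7))           # outcomes of one fair die throw

def prob(event):
    return Fraction(len(event), len(OMEGA))

def conditional(b, a):
    """P(B|A) = P(BA) / P(A)."""
    return prob(b & a) / prob(a)

# Three exclusive and exhaustive events, playing the role of the wedges.
partition = [frozenset({1, 2}), frozenset({3, 4}), frozenset({5, 6})]
B = frozenset({1, 2, 3})

# Right-hand side of the theorem on total probability.
total = sum(conditional(B, a) * prob(a) for a in partition)
```

Here the three conditional probabilities are 1, 1/2, and 0, each weighted by 1/3, and the sum agrees with the direct value of $P(B)$.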


B.3 Repeated Trials

Suppose we repeat an experiment $n$ times, and we want to know if a particular
event occurs $k$ times. For example, in the die-throwing experiment we might want
to know how many times the die will come up with an even number. Thus we have
two outcomes, represented by the sets $A_{\text{odd}} = \{1, 3, 5\}$ and $A_{\text{even}} = \{2, 4, 6\}$.

We begin by recalling from combinatorial analysis that if we are given $n$ objects,
and we are asked to form distinct sets of $k$ objects each, then there are $N_n(k)$ such
sets, where

$$ N_n(k) = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \tag{B.9} $$

Here a set is an unordered collection of objects, where two sets are distinct if they
differ by at least one element. Thus two groups $X$ and $Y$ are different if
$(X \cup Y) - (X \cap Y) \ne \emptyset$.

We will use this result to solve a useful problem: suppose we have an experiment
with two possible outcomes, $a$ and $b$, and we know the probabilities for each. What
is the probability that we will get $k$ outcomes of type $a$ in $n$ runs of the experiment?

We start by using the result above to answer a simpler question, and then building.
Suppose we have an ordered collection of $n$ objects, where each object is one of two
types, type $a$ and type $b$. We could write this ordered collection as a string of $n$
characters, $a$ and $b$. If there are $k$ objects of type $a$, then there are $n - k$ objects of
type $b$. How many different $n$-letter strings of $a$ and $b$ can we write, such that each
string contains $k$ instances of $a$?

Suppose we associate the $n$ positions in the string with the integers 1 through $n$
inclusive, and we make a set of the indices where each $a$ occurs; this will be a set of
$k$ integers. For example, for $n = 8$ and $k = 4$, the string abaabbab would give us
the set $\{1, 3, 4, 7\}$. Our problem now is to count how many subsets of $k$ integers can
be formed from a set of $n$ integers. We have transformed our problem into a form
that fits our result from above, and we observe that there are exactly $N_n(k)$ different
strings of $n$ symbols that contain exactly $k$ instances of $a$.

We will use this result to find the probability of a certain number of events in
repeated trials. Suppose we repeat an experiment $n$ times, and we are looking for an
event $A$ that occurs with probability $P(A)$. We want to find $p_n(k)$, the probability
that $A$ occurs $k$ times.

Suppose that $P(A) = p$, $P(\bar{A}) = q$, and $p + q = 1$, where $\bar{A}$ is the set of all
outcomes that are not $A$. We write the result of experiment $i$ as $B_i$, which is either
$A$ or $\bar{A}$. The result of the $n$ experiments is then the sequence $\{B_1, \ldots, B_n\}$. The
probability of getting $k$ events of type $A$ in a specific order is given by

$$ P_1(B_1)\,P_2(B_2) \cdots P_n(B_n) = p^k q^{n-k} \tag{B.10} $$

where $P_i(B_i)$ is the probability of the result $B_i$: $p$ if $B_i$ is $A$, and $q$ if it is $\bar{A}$.
Thus the probability that $A$ will occur $k$ times in a specific order is given by
$p^k q^{n-k}$. Now we want to know the probability that $k$ events of type $A$ will occur
in any order; this is simply the sum over all the different orders in which those $k$
events can occur, and we found above that there are $N_n(k)$ of them.
These $N_n(k)$ different orders are mutually exclusive, so

$$ p_n(k) = P(A \text{ occurs } k \text{ times}) = \binom{n}{k} p^k q^{n-k} \tag{B.11} $$

This formula is also known as the result of multiple, independent Bernoulli trials for
two outcomes.
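
The counting argument can be checked by brute force for small $n$. The sketch below compares the closed form of Equation B.11 against an explicit enumeration of every string of $a$ and $b$; the function names are illustrative, and exact fractions are used so that the comparison is not affected by rounding.

```python
from fractions import Fraction
from itertools import product
from math import comb

def binomial_probability(n, k, p):
    """Closed form: (n choose k) p^k q^(n-k), with q = 1 - p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def brute_force_probability(n, k, p):
    """Sum p^k q^(n-k) over every n-letter string with exactly k letters 'a'."""
    q = 1 - p
    total = Fraction(0)
    for outcome in product("ab", repeat=n):
        if outcome.count("a") == k:
            total += p**k * q ** (n - k)
    return total

p_even = Fraction(1, 2)                  # probability of an even face on a fair die
closed = binomial_probability(8, 4, p_even)
enumerated = brute_force_probability(8, 4, p_even)
```

For eight throws, the probability of exactly four even faces comes out to 70/256 either way, and the probabilities for $k = 0, \ldots, 8$ sum to 1.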

If there are more than two outcomes, Equation B.11 generalizes directly. Suppose
we have $m$ exclusive and exhaustive outcomes $A_i$, each with a probability $p_i$ (thus
$p_1 + \cdots + p_m = 1$). Then the probability that event $A_1$ occurs $k_1$ times, and event
$A_2$ occurs $k_2$ times, and event $A_3$ occurs $k_3$ times, and so on, is given by

$$ p_n(k_1, k_2, \ldots, k_{m-1}, k_m) = \frac{n!}{k_1!\, k_2! \cdots k_{m-1}!\, k_m!}\; p_1^{k_1} p_2^{k_2} \cdots p_{m-1}^{k_{m-1}} p_m^{k_m} \tag{B.12} $$
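
A direct transcription of this formula is short. The following sketch assumes the counts and probabilities are supplied as parallel sequences and that $n$ is the sum of the counts; the function name `multinomial_probability` is a label chosen for this illustration.

```python
from fractions import Fraction
from math import factorial

def multinomial_probability(counts, probs):
    """n! / (k_1! ... k_m!) * p_1^k_1 ... p_m^k_m, with n = sum(counts)."""
    n = sum(counts)
    coefficient = factorial(n)
    for k in counts:
        coefficient //= factorial(k)     # each division is exact
    result = Fraction(coefficient)
    for k, p in zip(counts, probs):
        result *= Fraction(p) ** k
    return result

third = Fraction(1, 3)
one_of_each = multinomial_probability((1, 1, 1), (third, third, third))
two_outcomes = multinomial_probability((4, 4), (Fraction(1, 2), Fraction(1, 2)))
```

With only two outcomes the expression reduces to the two-outcome result, so `two_outcomes` matches the value given by Equation B.11 for $n = 8$, $k = 4$.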


B.4 Random Variables 

In the section above we discussed associating a probability with each possible
outcome of an experiment. Suppose that we have a set of events $A_i$ that are exhaustive
and exclusive, each with an associated value $\eta > 0$. Then we say that $\eta$ is a random
variable (often abbreviated r.v.). Associated with the random variable is a
distribution function $F(y)$. This function is defined to give the probability that the event
that occurs has an associated value which is no greater than $y$. This function is
sometimes called the cumulative distribution function. In symbols,

$$ F(y) = P(\eta \le y) \tag{B.13} $$

So this is the probability that any event will occur that has an associated value $\eta$ no
greater than $y$. If we sort all the events $A_i$ by their associated $\eta$, then $F(y)$ is satisfied by all
the events $A_i$ for which $\eta \le y$. We note that $F(-\infty) = 0$, $F(+\infty) = 1$, and that $F$
is nondecreasing with $y$.

Associated with each distribution function is a density function. Typically the two
are notated by the same letter, a capital for the distribution function and lowercase
for the density function. When the distribution function is continuous, the density
function may be written as its derivative: $f(y) = dF(y)/dy$. In this situation, the
density function $f(y)$ may be thought of as the relative likelihood that the random
variable will take on a value near $y$.

In the die-throwing experiment, we have six outcomes: $\Omega = \{A_1, \ldots, A_6\}$. These
six events are exhaustive and exclusive; all six possible results are accounted for,
and only one event will describe the result of any experiment. Since there are
six possibilities with equal likelihoods, we attach equal probabilities to the events:
$P(A_i) = 1/6$, $1 \le i \le 6$. The set $A_3 \cup A_4$ corresponds to our throwing either a 3 or
4. We will now define a random variable $\xi$ on $\Omega$ by creating a real-valued function
that associates a number with each event: $\xi(A_i) = i$, $1 \le i \le 6$. So for every possible
outcome of the experiment, we have an associated number, which is the purpose of
creating a random variable.
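
For this random variable the distribution function is a staircase. The sketch below evaluates $F(y) = P(\xi \le y)$ for the fair die, assuming the mapping $\xi(A_i) = i$ described above; the name `die_cdf` is introduced only for this illustration.

```python
from fractions import Fraction

def die_cdf(y):
    """F(y) = P(xi <= y) for the fair-die random variable xi(A_i) = i."""
    faces_at_or_below = sum(1 for i in range(1, 7) if i <= y)
    return Fraction(faces_at_or_below, 6)

# F is 0 below the smallest face, steps up by 1/6 at each face, and is 1 from 6 onward.
samples = [die_cdf(y) for y in (0, 1, 2.5, 3, 6, 10)]
```

The values never decrease as `y` grows, and they run from 0 to 1, matching the general properties noted for any distribution function.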

We will often deal with experiments where two or more random variables are
observed simultaneously. For example, we might throw a pair of dice, or we might
note the age and size of a bird. Suppose these two random variables are $\eta_1$ and $\eta_2$,
and we are interested in determining if $\eta_1 \le s_1$ while simultaneously $\eta_2 \le s_2$. We
can write the joint distribution function $F$ on $\eta_1$ and $\eta_2$:

$$ F(s_1, s_2) = P(\eta_1 \le s_1,\ \eta_2 \le s_2) \tag{B.14} $$

Suppose the associated 2D density function $f$ is continuous. Then we can write
the joint cumulative distribution function as a sum of all the densities up to a given
point $(s_1, s_2)$:

$$ F(s_1, s_2) = \int_{-\infty}^{s_2} \int_{-\infty}^{s_1} f(u, v)\, du\, dv \tag{B.15} $$

For a given value $(s_1, s_2)$, this tells us the probability that an experiment will have
an outcome where the two observed variables satisfy $\eta_1 \le s_1$ and $\eta_2 \le s_2$
simultaneously. For example, this is the probability that we will spot a bird that is less than
2 years old and smaller than 8 inches in length. The joint cumulative distribution
function is familiar to anyone who has used sum tables [111] to help out with texture
mapping.

Suppose we now ask about the distribution of $\eta_1$ without regard for the state of
$\eta_2$. For example, we might simply ask how many birds are less than 2 years old,
whatever their size. We could write

$$ F_1(s_1) = P(\eta_1 \le s_1) = \int_{-\infty}^{+\infty} \left[ \int_{-\infty}^{s_1} f(u, v)\, du \right] dv \tag{B.16} $$

We call $F_1$ the marginal distribution of $F$ with respect to $\eta_1$. Similarly, we can write
the marginal density of $\eta_1$ as

$$ f_1(s_1) = \int_{-\infty}^{+\infty} f(s_1, v)\, dv \tag{B.17} $$

The marginal distribution (or density) can be thought of as the projection of a
distribution (or density) along one or more dimensions.

Let’s now posit a function $g(\eta)$, and ask for its mean value, also called its
expectation, written $E[g]$. The expected value of $g$ is defined by

$$ E[g] = \int g(y)\, dF(y) \tag{B.18} $$

There are several interpretations of Equation B.18, depending on how general we
want to be about what functions $g$ should be permitted, and what conditions need
to be placed on $F(y)$ to create a meaningful definition for $dF(y)$. In practice, two
interpretations prove most useful: the continuous and discrete. If we think of $F(y)$
as a function with derivative $f(y)$, then we can write Equation B.18 as

$$ E[g] = \int g(y)\, f(y)\, dy \tag{B.19} $$

If $F(y)$ is pieced together from flat segments separated by jumps of height $f_i$ at
$y_i$, then we can write Equation B.18 as

$$ E[g] = \sum_i g(y_i)\, f_i \tag{B.20} $$

Intuitively, the expected value for $g$ is the weighted average of the values $g$ takes on.
The weights are given by the probability associated with each value of $\eta$. In other
words, when a result $\eta_0$ is likely to occur, $g(\eta_0)$ will be multiplied by a relatively
large value, and when $\eta_0$ is unlikely, $g(\eta_0)$ will be scaled down. The sum (or integral)
of these weighted values gives us the average result returned by $g$ over
all possible values of $\eta$, which need not be the single most likely result. The values
$f(y)$ and $f_i$ in Equations B.19 and B.20 are
the frequency functions of the random variable $\eta$. The function $f(y)$ is called the
probability density function, or, as mentioned earlier, sometimes simply the density
function for the random variable $\eta$.

The definition of expected value in Equation B.18 has some interesting
consequences. For example, we can observe that

$$ E[a\eta + b] = aE[\eta] + b \tag{B.21} $$

Another result of the definition is that if a set of random variables $\eta_i$ are independent,
then it seems reasonable that the expected value of a sum of functions of them should
be the same as the sum of the individual expected values. In symbols,

$$ E\left[\sum_i g_i(\eta_i)\right] = \sum_i E[g_i(\eta_i)] \tag{B.22} $$

and indeed this can be proven to be true. The surprise is that this relationship is also
true even when the random variables are not independent! Note that this relation
does not say that $E[g(\eta)] = g(E[\eta])$; this is usually not a true relation.
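
The discrete form of the expected value is easy to evaluate for the fair die. The following sketch assumes the six faces are equally likely and uses exact fractions; the helper name `expect` is a label chosen for this illustration.

```python
from fractions import Fraction

FACES = range(1, 7)
P = Fraction(1, 6)                       # probability attached to each face

def expect(g):
    """Discrete expected value: sum of g(y_i) weighted by the probability of y_i."""
    return sum(g(y) * P for y in FACES)

mean = expect(lambda y: y)                       # E[eta]
shifted = expect(lambda y: 2 * y + 3)            # E[2 eta + 3]
mean_of_square = expect(lambda y: y * y)         # E[eta^2]
square_of_mean = mean * mean                     # (E[eta])^2
```

The shifted value equals $2E[\eta] + 3$, as Equation B.21 requires, while `mean_of_square` and `square_of_mean` differ, which is a concrete instance of $E[g(\eta)] \ne g(E[\eta])$.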


B.5 Measures 

There are several useful measures associated with the expected value of a function; 
we review some of them here. 

The value $E[\eta^r]$ is called the $r$th moment of $\eta$. Define $\mu = E[\eta]$. The value
$\mu_r = E[(\eta - \mu)^r]$ is called the $r$th central moment of $\eta$. The first moment
of $\eta$, which is simply $\mu$, is called the mean of $\eta$. The second central moment of $\eta$,
written $\mu_2$, is called the variance of $\eta$, also written $\mathrm{var}(\eta)$ or just $V$. The mean tells
us something about the value of the random variable, while the variance tells us how
spread-out the values are around that mean. The standard deviation is defined by
$\sigma = \sqrt{\mathrm{var}(\eta)}$. The coefficient of variation is defined by $\sigma/\mu$.

The standard deviation is an important measure of the degree to which
observations will cluster about the mean value. To illustrate this, we can state the following
theorem [415]: Let $\eta$ be a random variable with expected value $\mu$ and standard
deviation $\sigma$, and let $k > 0$. Then

$$ P(|\eta - \mu| \ge k\sigma) \le 1/k^2 \tag{B.23} $$

In words, this says that the probability of getting a result at least $k$ standard deviations
away from the mean is no greater than $1/k^2$.
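
The bound can be observed directly for the fair die. This is a sketch, assuming equally likely faces and floating-point arithmetic; the names `mu`, `variance`, `sigma`, and `tail_probability` are introduced here for illustration.

```python
import math

FACES = [1, 2, 3, 4, 5, 6]
P = 1.0 / 6.0                            # probability of each face

mu = sum(y * P for y in FACES)                        # mean
variance = sum((y - mu) ** 2 * P for y in FACES)      # second central moment
sigma = math.sqrt(variance)                           # standard deviation

def tail_probability(k):
    """P(|eta - mu| >= k * sigma) for the fair die."""
    return sum(P for y in FACES if abs(y - mu) >= k * sigma)

bound_holds = all(tail_probability(k) <= 1.0 / k**2 for k in (0.5, 1.0, 1.2, 1.5, 2.0))
```

For the die the bound is far from tight: at $k = 1.2$ only the faces 1 and 6 lie far enough from the mean, giving a probability of 1/3 against a bound of about 0.69.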

Some definitions tell us how two random variables are associated with each
other. Suppose we have two random variables $\eta_1$ and $\eta_2$ with means $\mu_1$ and $\mu_2$. The
covariance of these two variables, $\mathrm{cov}(\eta_1, \eta_2)$, is defined by

$$ \mathrm{cov}(\eta_1, \eta_2) = E[(\eta_1 - \mu_1)(\eta_2 - \mu_2)] \tag{B.24} $$

The independence of the two random variables is a sufficient (but not necessary)
condition for the covariance to be zero.

The correlation coefficient $\rho$ is defined by

$$ \rho = \frac{\mathrm{cov}(\eta_1, \eta_2)}{\sqrt{\mathrm{var}(\eta_1)\, \mathrm{var}(\eta_2)}} \tag{B.25} $$

The value of $\rho$ lies in the interval $[-1, 1]$. If $\rho < 0$, then $\eta_1$ and $\eta_2$ are negatively
correlated; if $\rho > 0$, then $\eta_1$ and $\eta_2$ are positively correlated; and if $\rho = 0$, then the
variables are said to be uncorrelated. Another useful formula is given by

$$ \mathrm{var}\left(\sum_{i=1}^{k} \eta_i\right) = \sum_{i=1}^{k} \sum_{j=1}^{k} \mathrm{cov}(\eta_i, \eta_j) \tag{B.26} $$

which is a corollary to Equation B.25.
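
The double-sum formula for the variance of a sum can be confirmed on a small example. In this sketch the random variables are three deliberately dependent functions of a single fair die throw, which is an assumption chosen to show that independence is not needed; `expect`, `cov`, and `variables` are illustrative names.

```python
from fractions import Fraction

OUTCOMES = range(1, 7)
P = Fraction(1, 6)

def expect(g):
    return sum(g(w) * P for w in OUTCOMES)

def cov(f, g):
    """cov(f, g) = E[(f - E[f]) (g - E[g])]; cov(f, f) is the variance of f."""
    mf, mg = expect(f), expect(g)
    return expect(lambda w: (f(w) - mf) * (g(w) - mg))

# Three dependent random variables defined on the same die throw.
variables = [lambda w: w, lambda w: w % 2, lambda w: w * w]

def total(w):
    return sum(v(w) for v in variables)

var_of_sum = cov(total, total)
sum_of_covs = sum(cov(f, g) for f in variables for g in variables)
```

The diagonal terms of the double sum are the individual variances, and the off-diagonal terms account for how the variables move together.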


B.6 Distributions 


We will be interested in random variables that take on their values according to
particular patterns, called distributions. A distribution may be thought of as a
function that specifies the cumulative distribution function for a random variable.

A few distributions are particularly important and are summarized here. The
normal distribution $F_n(y)$ is specified by the mean $\mu$ and variance $\sigma^2$ of the pattern, as
well as a real value $t$, which serves as the variable of integration:

$$ F_n(y) = \frac{1}{\sigma\sqrt{2\pi}} \int_{-\infty}^{y} e^{-(t - \mu)^2 / 2\sigma^2}\, dt \tag{B.27} $$

The exponential distribution $F_e(y)$ is specified by a single real value $\lambda$:

$$ F_e(y) = \begin{cases} 0 & y < 0 \\ 1 - e^{-\lambda y} & y \ge 0 \end{cases} \tag{B.28} $$

The rectangular distribution $F_r(y)$ is specified by a lower and upper bound,
given by real numbers $a$ and $b$:

$$ F_r(y) = \begin{cases} 0 & y < a \\ (y - a)/(b - a) & a \le y \le b \\ 1 & y > b \end{cases} \tag{B.29} $$

The binomial distribution $F_b(y)$ is specified by two non-negative integers $n$ and
$t$ (the latter serving as the index of summation), and a real value $p \in [0, 1]$:

$$ F_b(y) = \sum_{t \le y} \binom{n}{t} p^t (1 - p)^{n - t} \tag{B.30} $$

As we saw earlier, this is the distribution of $\eta$ occurrences of a desired event with
probability $p$ out of $n$ trials.

The values associated with each distribution are called the parameters of the 
distribution, and serve to completely characterize it. We will often guess a random 
variable to have a distribution given by one of the above functions; our job is then 
to find the appropriate values of the parameters. 
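
Each of these distributions is straightforward to evaluate once its parameters are chosen. The sketch below uses the standard textbook forms of the four cumulative distribution functions, with the normal distribution written in terms of the error function from Python's `math` module; the function names and argument order are assumptions made for this illustration.

```python
import math

def normal_cdf(y, mu, sigma):
    """Normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((y - mu) / (sigma * math.sqrt(2.0))))

def exponential_cdf(y, lam):
    """Exponential distribution with rate parameter lam."""
    return 0.0 if y < 0 else 1.0 - math.exp(-lam * y)

def rectangular_cdf(y, a, b):
    """Rectangular (uniform) distribution on [a, b]."""
    if y < a:
        return 0.0
    if y > b:
        return 1.0
    return (y - a) / (b - a)

def binomial_cdf(y, n, p):
    """Probability of at most y occurrences in n trials, each with probability p."""
    return sum(math.comb(n, t) * p**t * (1.0 - p) ** (n - t)
               for t in range(n + 1) if t <= y)
```

Fitting a distribution to data then amounts to choosing `mu` and `sigma`, `lam`, `a` and `b`, or `n` and `p` so that the function matches the observed cumulative frequencies.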


B.7 Geometric Series 

We will often encounter expressions of the form

$$ \sum_{n=0}^{N-1} r^n \tag{B.31} $$

for some real or complex value $r$. This is called a geometric series with kernel $r$. A
closed form expression for the first $N$ terms of such a series can be found in most
calculus texts; we provide it here for reference:

$$ \sum_{n=0}^{N-1} r^n = \begin{cases} N & r = 1 \\[4pt] \dfrac{1 - r^N}{1 - r} & r \ne 1 \end{cases} \tag{B.32} $$
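
The closed form is easily compared against the term-by-term sum. This sketch accepts any numeric type that supports the usual arithmetic, including exact fractions and complex numbers; the function names are illustrative.

```python
from fractions import Fraction

def geometric_sum(r, n_terms):
    """Closed form for the sum of r^n over n = 0, ..., n_terms - 1."""
    if r == 1:
        return n_terms
    return (1 - r**n_terms) / (1 - r)

def geometric_sum_direct(r, n_terms):
    """The same sum, accumulated one term at a time."""
    return sum(r**n for n in range(n_terms))

half = Fraction(1, 2)
closed = geometric_sum(half, 10)
direct = geometric_sum_direct(half, 10)
```

With $r = 1/2$ and ten terms both routes give 1023/512, and with the complex value $r = i$ four terms sum to zero because the powers of $i$ cycle.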


B.8 Further Reading 

For more statistical background, any of a number of classic texts may be consulted;
I found the book by Papoulis [331] particularly clear and direct. This appendix has
presented a classical view of probability. A different approach is advocated by some
who work in the field of fuzzy logic. A popular introduction to this field is given by
Kosko [253]; a textbook approach by the same author may be found in [252].




Science is an attempt to develop a system for 
the evolution of constructions of reality, and to 
permit a graceful exit for the dinosaurs. 

Walter Truett Anderson 
(“Reality Isn’t What It Used To Be,” 1992)



HISTORICAL NOTES 


In this appendix we briefly review the history of the laws of specular reflection and 
refraction. They represent one of the most obvious and basic properties of light, and 
have invited analysis for over 2,000 years. 


C.1 Specular Reflection and Transmission

The law of specular reflection states that the angle of incidence equals the angle of
reflection ($\theta_i = \theta_r$). The law of specular refraction (or Snell’s law) is only slightly
more complicated: it states $\eta_i \sin\theta_i = \eta_t \sin\theta_t$, where $\eta_i$ and $\eta_t$ are the incident and
transmitted indices of refraction. Although both laws are easily verified in everyday
situations, it is useful to understand where they come from.

One of the many accessible explanations arises from Fermat’s law, which leads to
the concept of optical path length. The development of Fermat’s law has a beautiful
history, which we recapitulate here briefly. This discussion is based on that in Hecht
[200].

FIGURE C.1

Hero’s principle.


Hero of Alexandria was a mathematician and inventor who lived and died
somewhere between the second century B.C. and the third century A.D. His findings are
as follows.

Hero’s principle: A ray of light leaving a point $S$, reflecting off a mirror
and then arriving at some point $P$, traverses the shortest possible path
in space.

Figure C.1 illustrates the geometry of this statement. Although true for reflection
in a homogeneous material, we know from experience that when light refracts (e.g.,
passing from air into water), it bends and therefore does not follow the shortest path
in space.

In 1657, Pierre de Fermat, a lawyer and amateur mathematician, generalized the
rule:


Fermat’s law: A ray of light, in traversing a route from any one point
to another, follows the path which takes the least amount of time to
negotiate.

Fermat’s law is more accurate than Hero’s principle, but it is still incomplete. We 
can generalize it with the idea of the optical path length. 

Figure C.2a shows a ray of light traveling through a liquid of smoothly changing
densities. At each point of the liquid the index of refraction is determined by the
local density. Thus the time of flight required by the ray to travel from point $P$ to
point $Q$ is given by

$$ t = \int_P^Q \kappa(s)\, ds \tag{C.1} $$

where $\kappa(s)$ represents the index of refraction as a function of position $s$ along the
ray. If we replace the continuous medium with a series of thin slabs, each with index
of refraction $\eta_j$, and the ray travels a distance $d_j$ in each medium, then the time of
flight is given by the discrete form

$$ t = \sum_{j} d_j\, \eta_j \tag{C.2} $$

FIGURE C.2

A ray of light traveling through a liquid of smoothly changing densities.

As the slabs become thinner, they approach a medium of smoothly changing
density, as in Figure C.2. The time of flight (in either form) is referred to as the
optical path length (OPL). We may now paraphrase Fermat:

Reworded Fermat: A ray traverses a route that corresponds to the 
shortest optical path length. 

Although this is more succinct, we have not improved the statement. What is 
wrong with it? Consider light that is leaving a point source P and subsequently being 
focused by a lens onto another point Q, as in Figure C. 3 . Clearly there are many 
paths that light can, and does, take. Fermat’s law is too restrictive when specifying 
the shortest optical path length. Correcting this restriction involves the stationary 
values of the OPL. 

A function f(x) has a stationary value at $x = x_0$ if $df/dx = 0$ at $x_0$. Thus a 
stationary value may correspond to a maximum, minimum, or point of inflection 
with horizontal tangent, as shown in Figure C.4. One implication of a vanishing 
derivative is that for a stationary value $f(x_0)$, the value $f(x) \approx f(x_0)$ when $x \approx x_0$. 

We can now paraphrase Fermat’s law in more general terms: 

Fermat’s law (modern form): A ray of light, when traveling from one 
point to another, follows a path that corresponds to a stationary value 
of the optical path length. 

FIGURE C.3 

A case where Fermat’s law doesn’t hold. 

FIGURE C.4 

A stationary value is a point of zero derivative; that is, a minimum, maximum, or inflection. 

This agrees with the example of the focused point source, since each of the many 
paths has the same optical path length. (The light near the edge of the lens travels a 
longer distance through the air than the light at the center of the lens, but the former 
travels only a small distance through the glass relative to the latter. Since the light 
travels more slowly in the glass, the times equal out.) 

FIGURE C.5 

The geometry of specular reflection. 


C.1.1 Specular Reflection 

We will now use the modern form of Fermat’s law to derive the law of reflection. 
Figure C.5 shows a light source P, a mirror M, a reflection point Q, and a reflected 
point R. The angle of incidence is $\theta_i$, and the angle of reflection is $\theta_r$; we wish 
to find their relationship. We will follow the development in Hecht [200]. For 
convenience and clarity, we will write the index of refraction as a constant, $\eta$, rather 
than explicitly writing the wavelength-dependent form $\eta(\lambda)$. 

Assume that the ray travels through a homogeneous material of index of refraction 
$\eta$. The optical path length is thus given by the distances involved: 


$$\text{OPL} = \eta\,\overline{PQ} + \eta\,\overline{QR} = \eta\sqrt{h^2 + x^2} + \eta\sqrt{b^2 + (a - x)^2} \qquad \text{(C.3)}$$

To find the stationary points of this OPL, we find d(OPL)/dx and set the result to 0: 

$$0 = \frac{\eta x}{\sqrt{h^2 + x^2}} - \frac{\eta (a - x)}{\sqrt{b^2 + (a - x)^2}} \qquad \text{(C.4)}$$

Note from the figure that 

$$\sin(\theta_i) = \frac{x}{\sqrt{h^2 + x^2}} \qquad \text{(C.5)}$$

$$\sin(\theta_r) = \frac{a - x}{\sqrt{b^2 + (a - x)^2}} \qquad \text{(C.6)}$$

FIGURE C.6 

The geometry of specular transmission. 

Thus we may rewrite Equation C.4 as 

$$0 = \eta\sin(\theta_i) - \eta\sin(\theta_r) \qquad \text{(C.7)}$$

which brings us to the law of reflection: 

$$\theta_i = \theta_r \qquad \text{(C.8)}$$

The law of reflection says that for the light to take a path corresponding to the 
shortest OPL, it must reflect at a point such that the angle of incidence equals the 
angle of reflection. 


C.1.2 Specular Transmission 

We will now use the modern form of Fermat’s law to derive the law of specular 
transmission; the discussion will follow very similar lines to the ones that led to the 
law of specular reflection. 

Figure C.6 shows light leaving a source P, striking an interface at point Q, 
and arriving at point R. Suppose that the media containing P and R are distinct, 
homogeneous media with indices of refraction $\eta_i$ and $\eta_t$. The angle of incidence is $\theta_i$ 
and the angle of refraction (or transmission) is $\theta_t$; we wish to find their relationship. 
We proceed as before, first writing the expression for the optical path length: 

$$\text{OPL} = \eta_i\,\overline{PQ} + \eta_t\,\overline{QR} = \eta_i\sqrt{h^2 + x^2} + \eta_t\sqrt{b^2 + (a - x)^2} \qquad \text{(C.9)}$$

We now find the derivative and set it to zero: 

$$0 = \frac{\eta_i x}{\sqrt{h^2 + x^2}} - \frac{\eta_t (a - x)}{\sqrt{b^2 + (a - x)^2}} \qquad \text{(C.10)}$$

Note from the figure that 

$$\sin(\theta_i) = \frac{x}{\sqrt{h^2 + x^2}} \qquad \text{(C.11)}$$

$$\sin(\theta_t) = \frac{a - x}{\sqrt{b^2 + (a - x)^2}} \qquad \text{(C.12)}$$

Thus we may rewrite Equation C.10 as 

$$0 = \eta_i\sin(\theta_i) - \eta_t\sin(\theta_t) \qquad \text{(C.13)}$$

This brings us to the law of refraction: 

$$\eta_i\sin(\theta_i) = \eta_t\sin(\theta_t) \qquad \text{(C.14)}$$

which is Snell’s law. This is the relationship between the angles of incidence and 
refraction, as we desired. 
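
The derivation can be checked numerically. The following Python sketch locates the stationary point of the optical path length of Equation C.9 by bisecting its derivative, then confirms that the resulting angles satisfy Snell's law. The function names, the use of bisection, and the sample geometry (h, b, a) and indices of refraction are illustrative assumptions and do not come from the text.

```python
import math

def opl(x, h, b, a, eta_i, eta_t):
    """Optical path length P -> Q -> R when the ray crosses the interface at offset x."""
    return eta_i * math.hypot(h, x) + eta_t * math.hypot(b, a - x)

def opl_derivative(x, h, b, a, eta_i, eta_t):
    """d(OPL)/dx; its zero marks the stationary point."""
    return (eta_i * x / math.hypot(h, x)
            - eta_t * (a - x) / math.hypot(b, a - x))

def stationary_x(h, b, a, eta_i, eta_t, iterations=200):
    """Bisect the derivative on [0, a]; it is negative at 0 and positive at a
    (assuming h, b, a > 0), and it increases monotonically in between."""
    lo, hi = 0.0, a
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if opl_derivative(mid, h, b, a, eta_i, eta_t) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def snell_residual(h, b, a, eta_i, eta_t):
    """eta_i sin(theta_i) - eta_t sin(theta_t) at the stationary point."""
    x = stationary_x(h, b, a, eta_i, eta_t)
    sin_i = x / math.hypot(h, x)
    sin_t = (a - x) / math.hypot(b, a - x)
    return eta_i * sin_i - eta_t * sin_t

if __name__ == "__main__":
    # Hypothetical air-to-glass setup: the residual should be close to zero.
    print(snell_residual(1.0, 1.0, 2.0, 1.0, 1.5))
```

With equal indices on both sides and h = b, the stationary point falls at x = a/2, which is the straight-line path.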

In this section we have derived the relationship between incident, reflected, and 
transmitted angles by assuming that light took a path governed by the modern form 
of Fermat’s principle. We might wonder how a ray of light “knows” the right 
path to follow—might it ever take another path only to find that it is not the right 
one? This and other questions are explored at a conversational level in the context 
of modern quantum mechanics in Feynman’s wonderful little book on quantum 
electrodynamics [144]. 




If there exists any one reliable algorithm for 
finding the roots of transcendental equations it 
is yet to be found. We have a variety of 
medicines that work with varying degrees of 
potency (including zero!), but the state of the 
art still precludes the confident writing of 
computational prescriptions without having 
looked over the patient rather closely. 

Forman S. Acton 

(“Numerical Methods That Work,” 1970) 



ANALYTIC FORM FACTORS 


This appendix contains a number of useful analytic form factors for different geometric 
situations. These form factors are adapted from the catalogs in Siegel and 
Howell [406] and Howell [215]. These references contain many hundreds of analytic 
form factors for many general and specialized geometries. 


D.1 Differential and Finite Surfaces 

This appendix organizes the form factors into three categories, depending on whether 
the two patches are both differential, both finite, or mixed. 


D.1.1 Differential to Differential 

DD1: Differential patch to differential patch. See Figure D.1. 

$$F_{dA_1,dA_2} = \frac{\cos\theta_1 \cos\theta_2}{\pi r^2}\, dA_2$$

FIGURE D.1 

Geometry for DD1: differential patch to differential patch. 

D.1.2 Differential to Finite 

DF1: Differential patch to finite patch. See Figure D.2. 

$$F_{dA_1,A_2} = \int_{A_2} \frac{\cos\theta_1 \cos\theta_2}{\pi r^2}\, V(P_1, P_2)\, dA_2$$


DF2: Differential plane element to plane parallel rectangle; normal to element passes 
through corner of rectangle. See Figure D.3. 

X = a/c, Y = b/c 

$$F_{dA_1,A_2} = \frac{1}{2\pi}\left[\frac{X}{\sqrt{1+X^2}}\tan^{-1}\frac{Y}{\sqrt{1+X^2}} + \frac{Y}{\sqrt{1+Y^2}}\tan^{-1}\frac{X}{\sqrt{1+Y^2}}\right]$$

FIGURE D.4 

Geometry for DF3: differential plane element to rectangle in plane at right angle to plane of element. 


DF3: Differential plane element to rectangle in plane at right angle to plane of 
element. See Figure D.4. 


X = a/b, Y = c/b 

$$F_{dA_1,A_2} = \frac{1}{2\pi}\left[\tan^{-1}\frac{1}{Y} - \frac{Y}{\sqrt{X^2+Y^2}}\tan^{-1}\frac{1}{\sqrt{X^2+Y^2}}\right]$$


DF4: Plane differential element to circular disk in plane parallel to element; normal 
to element passes through center of disk. See Figure D.5. 


$$F_{dA_1,A_2} = \frac{r^2}{h^2 + r^2}$$
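
As a sanity check, the closed form for DF4 can be compared against a direct numerical integration of the general integral DF1 over the disk. The sketch below assumes that h is the distance from the element to the plane of the disk, that r is the disk radius, and that the disk is fully visible (V = 1); the function names and the ring-by-ring midpoint rule are illustrative choices, not part of the source catalog.

```python
import math

def df4_closed_form(h, r):
    """Closed form r^2 / (h^2 + r^2) for an element facing a coaxial disk."""
    return r * r / (h * h + r * r)

def df4_numeric(h, r, rings=20000):
    """Midpoint-rule integration of cos(theta1) cos(theta2) / (pi d^2) dA
    over thin concentric rings of the disk."""
    d_rho = r / rings
    total = 0.0
    for k in range(rings):
        rho = (k + 0.5) * d_rho                  # ring radius
        dist2 = h * h + rho * rho                # squared distance to the ring
        cos_theta = h / math.sqrt(dist2)         # same angle at both ends
        ring_area = 2.0 * math.pi * rho * d_rho
        total += cos_theta * cos_theta / (math.pi * dist2) * ring_area
    return total

if __name__ == "__main__":
    print(df4_closed_form(2.0, 1.0), df4_numeric(2.0, 1.0))
```

Both surfaces are parallel and share an axis, so the two cosines are equal and the integrand depends only on the ring radius.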


DF5: Plane differential element to circular disk in plane parallel to element. See 
Figure D.6. 

$$H = \frac{h}{a} \qquad R = \frac{r}{a} \qquad Z = 1 + H^2 + R^2$$

$$F_{dA_1,A_2} = \frac{1}{2}\left[1 - \frac{1 + H^2 - R^2}{\sqrt{Z^2 - 4R^2}}\right]$$








FIGURE D.5 

Geometry for DF4: plane differential element to circular disk in plane parallel to element; normal 
to element passes through center of disk. 

FIGURE D.6 

Geometry for DF5: plane differential element to circular disk in plane parallel to element. 

DF6: Spherical point source to a sphere of radius r. See Figure D.7. 

FIGURE D.7 

Geometry for DF6: spherical point source to a sphere of radius r. 


DF7: Plane differential element to sphere of radius r; normal to center of element 
passes through center of sphere. See Figure D.8. 

$$F_{dA_1,A_2} = \left(\frac{r}{h}\right)^2$$


DF8: Plane differential element to sphere of radius r; tangent to element passes 
through center of sphere. See Figure D.9. 


$$H = \frac{h}{r}$$

$$F_{dA_1,A_2} = \frac{1}{\pi}\left[\tan^{-1}\frac{1}{\sqrt{H^2 - 1}} - \frac{\sqrt{H^2 - 1}}{H^2}\right]$$


DF9: Sphere to disk; normal to center of disk passes through center of sphere. See 
Figure D.10. 


$$F_{A_1,A_2} = \frac{1}{2}\left[1 - \frac{1}{\sqrt{1 + R^2}}\right]$$



















FIGURE D.8 

Geometry for DF7: plane differential element to sphere of radius r; normal to center of element 
passes through center of sphere. 

FIGURE D.9 

Geometry for DF8: plane differential element to sphere of radius r; tangent to element passes 
through center of sphere. 

FIGURE D.10 

Geometry for DF9: sphere to disk. 

FIGURE D.11 

Geometry for DF10: differential element perpendicular to standing person facing it. 


DF10: Differential element perpendicular to standing person facing it. Distances are 
measured in feet, weight in pounds. See Figure D.11. 


$$F_{dA_1,A_2} = \frac{x\,[0.65 + \cos\alpha\,(0.715 + 0.52\,|\cos\beta|)]\; h\, w^{1/3}}{30.8\,(x^2 + y^2 + z^2)^{1.5}}$$


DF11: Differential element perpendicular to sitting person facing it. Distances are 
measured in feet, weight in pounds. See Figure D.12. 

FIGURE D.12 

Geometry for DF11: differential element perpendicular to sitting person facing it. 


DF12: Differential element to cow. Distances measured in meters. (This is an 
approximation based on using an ideal spherical cow of radius R.) See Figure D.13. 


r =-l 


x = l s 


Y = i 


FdA 1 ,A 2 = 


(i + x 2 + y 2 ) 3 / 2 







FIGURE D.13 


Geometry for DF12: differential element to cow. 


D.1.3 Finite to Finite 

FF1: Finite patch to finite patch. See Figure D.14. 

$$F_{A_1,A_2} = \frac{1}{A_1}\int_{A_1}\int_{A_2} \frac{\cos\theta_1 \cos\theta_2}{\pi r^2}\, V(P_1, P_2)\, dA_2\, dA_1$$


FF2: Strip element to rectangle in plane parallel to strip; strip is opposite one edge 
of rectangle. See Figure D.15. 


X = a/c, Y = b/c 

$$F_{dA_1,A_2} = \frac{1}{\pi Y}\left[\sqrt{1+Y^2}\,\tan^{-1}\frac{X}{\sqrt{1+Y^2}} - \tan^{-1}X + \frac{XY}{\sqrt{1+X^2}}\tan^{-1}\frac{Y}{\sqrt{1+X^2}}\right]$$





FIGURE D.14 

Geometry for FF1: finite patch to finite patch. 

FIGURE D.15 

Geometry for FF2: strip element to rectangle in plane parallel to strip. 

FIGURE D.16 

Geometry for FF3: strip element to rectangle in plane at right angles to strip. 


FF3: Strip element to rectangle in plane at right angles to strip. See Figure D.16. 


X = a/b, Y = c/b 

$$F_{dA_1,A_2} = \frac{1}{\pi}\left[\tan^{-1}\frac{1}{Y} + \frac{Y}{2}\ln\frac{Y^2(X^2+Y^2+1)}{(Y^2+1)(X^2+Y^2)} - \frac{Y}{\sqrt{X^2+Y^2}}\tan^{-1}\frac{1}{\sqrt{X^2+Y^2}}\right]$$


FF4: Identical parallel directly opposed rectangles. See Figure D.17. 

X = a/c, Y = b/c 

$$F_{A_1,A_2} = \frac{2}{\pi XY}\left\{\ln\sqrt{\frac{(1+X^2)(1+Y^2)}{1+X^2+Y^2}} + X\sqrt{1+Y^2}\,\tan^{-1}\frac{X}{\sqrt{1+Y^2}} + Y\sqrt{1+X^2}\,\tan^{-1}\frac{Y}{\sqrt{1+X^2}} - X\tan^{-1}X - Y\tan^{-1}Y\right\}$$

FIGURE D.17 

Geometry for FF4: identical parallel directly opposed rectangles. 
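
Closed forms such as FF4 are easy to mistype, so it is worth cross-checking them against the defining double integral FF1. The sketch below evaluates the standard closed form for two identical, directly opposed a-by-b rectangles a distance c apart (with X = a/c and Y = b/c) and compares it with a Monte Carlo estimate of FF1 with full visibility. The function names, sample count, and fixed seed are assumptions made for illustration.

```python
import math
import random

def ff4_parallel_rectangles(a, b, c):
    """Closed-form form factor between identical, directly opposed rectangles."""
    X, Y = a / c, b / c
    log_term = math.log(math.sqrt((1 + X * X) * (1 + Y * Y) / (1 + X * X + Y * Y)))
    x_term = X * math.sqrt(1 + Y * Y) * math.atan(X / math.sqrt(1 + Y * Y))
    y_term = Y * math.sqrt(1 + X * X) * math.atan(Y / math.sqrt(1 + X * X))
    bracket = log_term + x_term + y_term - X * math.atan(X) - Y * math.atan(Y)
    return 2.0 / (math.pi * X * Y) * bracket

def ff1_monte_carlo(a, b, c, samples=100000, seed=1):
    """Estimate FF1 by averaging the kernel over random point pairs.
    F = (1/A1) * double integral = A2 * E[kernel] for uniform point pairs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x1, y1 = rng.uniform(0.0, a), rng.uniform(0.0, b)
        x2, y2 = rng.uniform(0.0, a), rng.uniform(0.0, b)
        r2 = (x2 - x1) ** 2 + (y2 - y1) ** 2 + c * c
        cos_sq = c * c / r2                      # cos(theta1) * cos(theta2)
        total += cos_sq / (math.pi * r2)
    return a * b * total / samples

if __name__ == "__main__":
    print(ff4_parallel_rectangles(1.0, 1.0, 1.0), ff1_monte_carlo(1.0, 1.0, 1.0))
```

For unit squares one unit apart, both routes should give a value close to 0.2; the Monte Carlo estimate carries statistical noise that shrinks with the square root of the sample count.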


FF5: Two finite rectangles of same length, with one common edge at right angles to 
each other. See Figure D.18. 


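H = h/l, W = w/l 

$$F_{A_1,A_2} = \frac{1}{\pi W}\left( W\tan^{-1}\frac{1}{W} + H\tan^{-1}\frac{1}{H} - \sqrt{H^2+W^2}\,\tan^{-1}\frac{1}{\sqrt{H^2+W^2}} + \frac{1}{4}\ln\left\{\frac{(1+W^2)(1+H^2)}{1+W^2+H^2}\left[\frac{W^2(1+W^2+H^2)}{(1+W^2)(W^2+H^2)}\right]^{W^2}\left[\frac{H^2(1+W^2+H^2)}{(1+H^2)(W^2+H^2)}\right]^{H^2}\right\}\right)$$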


FF6: Parallel circular disks with centers along the same normal. See Figure D.19. 




$$X = 1 + \frac{1 + R_2^2}{R_1^2}$$


FiAt ,Aj 
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Geometry for FF6: parallel circular disks with centers along the same normal. 
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Geometry for FF7: concentric spheres. 

FF7: Concentric spheres. See Figure D.20. 

f a u a 2 

F A 2 ,A\ 

Fa 2 ,a 2 



FF8: A differential ring on a disk to a sphere whose normal through center passes 
through sphere center. See Figure D.21. 

$$R_1 = r_1/a \qquad R_2 = r_2/a$$


F = ^ 

dA " A2 (l + fl! 2 )3/2 
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FI8URI D • 2 1 

Geometry for FF8: a differential ring on a disk to a sphere whose normal through center passes 
through sphere center. 



FIGURE D.22 

Geometry for FF9: parallel squares of different sizes. 
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FF9: Parallel squares of different sizes. See Figure D.22. 

A = a/c B = b/a X = A(1 + B) Y = A(1 - B) 
A<0.2:F AltAa = ^¥- 


A> 02F 1 

A_0. . Fai ,a* wA2 < 1 ( y 2 + 2 )( X 2 + 2 ) 


+VY 2 + 4 


WX 2 + 4 X tan -1 


. ;f,aD ' 1 ( 7 TO)- rta "‘ , ( 7 TO)]} 


FF10: Parallel rectangles with parallel sides. See Figure D.23. 

X = x/z N = T\f z 

Y = y/z S = Z/z 

oti,i = Si- Xi fa j = Nk-Yj 

^ 2 2 2 2 


Fa " M (X 2 - X x ){Y 2 - Yi) [ ( l) (,+J+fc+,) G(a M ,^ > i)] 


G(a lyi ,(3k,j) = < ai,i\J\ + 0k/ tan 1 | 

{ \Jl +Pk,j 2 


-0k,j tan 1 Pk,j + J 1 + <xi,i 2 Pk,j tan' 


^ V Vvl + °W / 

<*M 2 In oi,< + (1/2) In (1 + /3 fcii 2 ) - (1/2) In [l + a t , 2 + 0k,j 2 ]} 


-1 ( Pk,j 


FF11: Infinite concentric cylinders. See Figure D.24. 

$$F_{A_1,A_2} = 1$$

$$F_{A_2,A_1} = \frac{r_1}{r_2}$$

$$F_{A_2,A_2} = 1 - \frac{r_1}{r_2}$$

FIGURE D.23 

Geometry for FF10: parallel rectangles with parallel sides. 

FIGURE D.24 

Geometry for FF11: infinite concentric cylinders. 















FIGURE D.25 

Geometry for FF12: sphere to a scalene triangle. 


FF12: Sphere to a scalene triangle. The normal to the triangle through one vertex 
passes through the center of the sphere. The plane of the triangle does not 
intersect the sphere. See Figure D.25. 

$$B_1 = b_1/a \qquad B_2 = b_2/b_1 \qquad B_3 = b_3/b_1$$


FdAt,A 2 = 4“ [ C °S _1 (^)-COS- 1 

1 f . _,[(!-*’) fl 3 2 - 

8n [ (1 + B 2 2 )Bj 


FF13: Large sphere to a smaller hemisphere, r > r^, ignoring base of hemisphere. 
See Figure D.26. 
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PIOURI D.26 

Geometry for FF13: large sphere to a smaller hemisphere, r ignoring base of hemisphere. 


D.2 Two Polygons 

A closed-form expression for the form factor between two polygons that are completely 
visible to each other has been developed by Schröder and Hanrahan. They 
report a summary of their work in [386] and give a more detailed explanation in 
[385]. The result is quite complex. The following summary of the mathematics is 
from [385] and appears courtesy of Peter Schröder. 

Before listing all the expressions to be computed we first define four auxiliary 
functions 


m(y) = 




( 6 - 1 )" 


ln(l + y) + 


2 (b-y) 


(6 2 — l)(y 2 _ l) 


2(6 + t/)(l 4- by) ((b - y) 2 + (by - l) 2 ) (1 - y)(l - fr) 

+ n (l + y)(l + 6) 


V (6 2 - l) 2 (y 2 - l) 2 

Lh (irf) - Lh (r^f)) 


) 


ln(6 + y) 


M(y) = 

G(q)(y) = 


y 


4(y 2 - l) 2 

q'(y) 


y , J_ - y_zl 

8(y 2 - 1) 16 y+ 1 


— lnq(y)-2y+^ta.n 1 
Zcl CL Cl 

2 


»(?>(») = (y + £-j£s) I-i(s>- 


yjgy - 6) 

2 a 

where $q(x) = ax^2 + bx + c$ is some arbitrary quadratic polynomial, $d = \sqrt{4ac - b^2}$, 
and 

$$\mathrm{Li}_2(z) = \sum_{k=1}^{\infty} \frac{z^k}{k^2}$$

is the dilogarithm (see [270]), closely related to the logarithm $\ln(1 - z) = -\sum_{k=1}^{\infty} z^k/k$. Its 
series representation is absolutely convergent in the unit disk. Using the functional 
relationship 

$$\mathrm{Li}_2(z) = -\frac{\pi^2}{6} - \frac{\ln^2(-z)}{2} - \mathrm{Li}_2\!\left(\frac{1}{z}\right),$$

the dilogarithm is defined in the entire complex plane. Efficient code for the evaluation 
of the dilogarithm function can be found in most special function libraries, e.g., 
fn from the mail server at netlib@research.att.com. 
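
To make the two definitions concrete, here is a minimal Python sketch of the dilogarithm for real arguments. It sums the power series inside the unit interval and uses the inversion relation above for z < -1. The function names, the fixed term count, and the restriction to real z ≤ 1 are simplifying assumptions; a production implementation (such as the library routines mentioned above) would handle complex arguments and accelerate convergence near |z| = 1.

```python
import math

def li2_series(z, terms=400):
    """Sum of z^k / k^2 for k = 1..terms; converges quickly when |z| is
    well below 1 and slowly as |z| approaches 1."""
    total = 0.0
    power = 1.0
    for k in range(1, terms + 1):
        power *= z
        total += power / (k * k)
    return total

def li2_real(z):
    """Dilogarithm for real z <= 1, using the inversion relation for z < -1."""
    if z > 1.0:
        raise ValueError("this sketch only handles real z <= 1")
    if z >= -1.0:
        return li2_series(z)
    # For z < -1, -z is positive, so the logarithm is real.
    return -math.pi ** 2 / 6.0 - 0.5 * math.log(-z) ** 2 - li2_series(1.0 / z)

if __name__ == "__main__":
    # Known identity: Li2(1/2) = pi^2/12 - (ln 2)^2 / 2.
    print(li2_real(0.5), math.pi ** 2 / 12.0 - 0.5 * math.log(2.0) ** 2)
```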

Given two edges we first compute the bi-quadratic form parameterizing the distance 
between the two edges as a function of s and t. Let $E_i$ and $E_j$ be parameterized 
by $x_i(t) = p_i + t d_i$ and $x_j(s) = p_j + s d_j$ with $\|d_{i,j}\| = 1$, respectively. We have 


co 

Cl 
C2 = 
C3 = 
C 4 = 
C 5 = 
ClO = 
Cll = 

Cl2 = 
Cl3 = 

Cl4 = 
Cl5 = 

cig(s) = 
cn(s) = 

cis(s) = 


= \\Ej\\ 

= —2d{ • dj 

= \\Ei\\ 

= “ 2 dj • (pi - pj) 

= 2 di • - pj) 

= \\Pi-Pj\\ 2 
= 4-c? 

= 4c 4 - cic 3 
= 4 c 5 - ej 

cn ~ y/cl\ — 4ciqCi2 

2cio 

\Zci! — 4C10C12 

ClO 

\/ClOCi 4 

C1C13 - c 3 - 2s 
-C15 + x/cfs-4 |c 16 (s)| 2 
-2lc 16 (s) 

-C15 - y/<4~ 4 l C 16(g)| 2 
-2icj 6 (s) 
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With these in hand we can compute the integral for a pair of edges 

c 2 rco 


fC2 [Co 

X(Ei,Ej)= / / In f{s,t)dsdt 

Jo Jo 

(* + |) G(f(s ,.))«) + .))(<) 

[tt(2A:(s) + 1 )M(t) 


S=CQ,t=C2 


- 2cqc 2 




+ C14C15 


-i |l(-c 17 (s))(<) + L(-c 18 (s))(*) - L(c 17 (s))(<) - L(c 18 (s))(*)} 

in terms of which the form factor for two polygons is given by 

Fp ' n = i^, £ 




J=0 ,«= x /|ir 
v ^ 13 


EiZdPy 

EjedP 2 


Converting these equations into running code is difficult; there are many subtleties 
that must be carefully handled. Peter Schröder has released his implementations in 
the Mathematica and C languages to the public domain. If they are not available to 
you via some local source, you can obtain them by anonymous ftp transfer from the 
computer ftp.cs.princeton.edu (internet address 128.112.92.1). Some of the 
files are compressed, so the binary transfer mode must be used (give the command 
binary to your ftp server). Under the directory pub/packages/formfactor 
are four files: 

ff.m An implementation in the Mathematica language. 

ffpaper.ps.Z A compressed version of the PostScript for the paper [386]. 

fftr.ps.Z A compressed version of the PostScript for the technical report [385]. 

libff.tar.Z A compressed version of an implementation in the C language in 
Unix-library form. 

Files ending in the suffix .Z are compressed; under Unix, run the program 
uncompress to expand them. The library is in the tar archival tape storage 
format. To extract the source files, under Unix run tar xvf libff.tar in an 
empty directory. 



I give you now Professor Twist, 
A conscientious scientist. 
Trustees exclaimed, “He never bungles!” 
And sent him off to distant jungles. 
Camped on a tropic riverside, 
One day he missed his loving bride. 
She had, the guide informed him later, 
Been eaten by an alligator. 
Professor Twist could not but smile. 
“You mean,” he said, “a crocodile.” 

Ogden Nash 
(“The Purist,” 1935) 



CONSTANTS AND UNITS 


| Quantity | Unit name | Symbol | Definition |
|---|---|---|---|
| Unit of length | meter | m | The length of the path traveled by light in a vacuum during a time interval of 1/299,792,458 of a second |
| Unit of mass | kilogram | kg | A mass equal to the mass of the international prototype of the kilogram (an alloy of platinum with 10% iridium, maintained in the Archives of France) |
| Unit of time | second | s | The duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels F = 4, M = 0 and F = 3, M = 0 of the ground state $^2S_{1/2}$ of the cesium-133 atom unperturbed by external fields |
| Unit of luminous intensity | candela | cd | The luminous intensity, in a given direction, of a source that emits monochromatic radiation of frequency 540 × 10^12 hertz and that has a radiant intensity in that direction of 1/683 watt per steradian |

TABLE E.1 

Definitions of the four basic units from the ANSI standard [432]. 

| Factor | Prefix | Symbol | Factor | Prefix | Symbol |
|---|---|---|---|---|---|
| 10^24 | yotta | Y | 10^-1 | deci | d |
| 10^21 | zetta | Z | 10^-2 | centi | c |
| 10^18 | exa | E | 10^-3 | milli | m |
| 10^15 | peta | P | 10^-6 | micro | μ |
| 10^12 | tera | T | 10^-9 | nano | n |
| 10^9 | giga | G | 10^-12 | pico | p |
| 10^6 | mega | M | 10^-15 | femto | f |
| 10^3 | kilo | k | 10^-18 | atto | a |
| 10^2 | hecto | h | 10^-21 | zepto | z |
| 10^1 | deka | da | 10^-24 | yocto | y |

TABLE E.2 

Basic engineering prefixes for different orders of magnitude. Note: The prefix “deka” is often 
written “deca.” Source: Data from the ANSI standard [432]. 


| Constant name | Symbol | Value |
|---|---|---|
| Boltzmann’s constant | $k$ | 1.38066 × 10^-23 J/°K |
| Planck’s constant | $h$ | 6.62620 × 10^-34 J · s |
| Speed of light | $c_0$ | 299,792,458 m/s |
| Stefan-Boltzmann constant | $\sigma$ | 5.67032 × 10^-8 W · m^-2 · °K^-4 |
| Solar constant | $E_s$ | 1.35 × 10^3 W/m^2 |
| Permittivity of vacuum | $\epsilon_0$ | 8.85 × 10^-12 farad/m = 8.85 × 10^-12 (A · s)/(V · m) |
| Permeability of vacuum | $\mu_0$ | 4π × 10^-7 henry/m = 4π × 10^-7 (V · s)/(A · m) |

TABLE E.3 

Physical constants. 
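
Several of these constants are not independent, which gives a quick way to catch transcription errors. The Python sketch below recomputes the Stefan-Boltzmann constant from the tabulated values of k, h, and the speed of light, using the standard blackbody relation σ = 2π⁵k⁴ / (15h³c²). That relation is a well-known result from physics rather than something stated in this appendix, and the variable and function names are illustrative.

```python
import math

# Values as tabulated in this appendix (SI units).
K_BOLTZMANN = 1.38066e-23      # J/K
H_PLANCK = 6.62620e-34         # J*s
C_LIGHT = 299_792_458.0        # m/s
SIGMA_TABLE = 5.67032e-8       # W m^-2 K^-4

def stefan_boltzmann(k, h, c):
    """sigma = 2 pi^5 k^4 / (15 h^3 c^2), the blackbody radiation constant."""
    return 2.0 * math.pi ** 5 * k ** 4 / (15.0 * h ** 3 * c ** 2)

def relative_error(computed, reference):
    return abs(computed - reference) / reference

if __name__ == "__main__":
    sigma = stefan_boltzmann(K_BOLTZMANN, H_PLANCK, C_LIGHT)
    print(sigma, relative_error(sigma, SIGMA_TABLE))
```

The computed and tabulated values should agree to within the rounding of the inputs, that is, a small fraction of a percent.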






TABLE E.4 

The periodic table of the elements. 











































































“If you had high hopes, how would you know 
how high they were? And did you know that 
narrow escapes come in different widths? 
Would you travel the whole wide world without 
ever knowing how wide it was? And how could 
you do anything at long last,” he concluded, 
waving his arms over his head, “without 
knowing how long the last was?” 

Norton Juster 

(“The Phantom Tollbooth,” 1961) 



LUMINAIRE STANDARDS 


In this appendix we summarize two standards for representing luminaires. In general, 
a luminaire is a complete physical structure for illumination that is composed of a 
lamp, a housing, and associated electrical and electronic support. The lamp is where 
the light is actually generated, using a variety of methods (e.g., a glowing filament, a 
carbon arc, or fluorescing gas). The housing is usually a metal container in which the 
lamp is mounted, and can usually be aimed in a variety of directions. The electrical 
apparatus supports the needs of the lamp for safety, efficiency, and longevity. 

The language of the standards is challenging. Rather than use the precise (and 
often awkward) language required by a complete standard, in this appendix I will 
summarize the language and give working descriptions of the standards. The reader 
thirsty for more detail can obtain copies of the full standards directly from the issuing 
agencies [222, 434]. 


F.1 Terminology 

The general idea of both formats is to first describe the luminaire and the lamp in 
physical terms, and then provide a set of photometric measurements of the output 
of the luminaire. These measurements are most naturally described using a spherical 
coordinate system. 

We will describe a luminaire by placing it at the center of a Euclidean coordinate 
system. Using traditional terminology, we call the axes the first axis, the second axis, 
and the third axis. We will abbreviate these as $A_1$, $A_2$, and $A_3$. They are illustrated 
in Figure F.1. 

FIGURE F.1 

The three axes for a luminaire. 

There are three ways to position the spherical coordinate system with respect to 
the luminaire, depending on which axes contain the poles. These are called the three 
goniometric configurations. Each coordinate system is represented by two angles: $\theta$ 
describing a point on a great circle around the sphere (the equator), and $\psi$ identifying 
points on a great hemicircle running from one pole to the other. Traditionally these 
angles are measured in degrees. 

In all systems, the sense of positive angles obeys the right-hand rule: wrap your 
right hand around the axis with thumb extended so that your thumb points away 
from the origin; the direction in which your fingers curl is the direction of positive 
rotation. 

If the poles are placed along the first axis, then we call this a (C, γ) (or C) 
type system, illustrated in Figure F.2. This type of arrangement is normally used 
for indoor and roadway luminaires. The angle γ ∈ [0°, 180°] measures the polar 
rotation around axis $A_3$. The angle C ∈ [0°, 360°] describes the equator rotation 
around axis $A_1$. 

If the poles are placed along the second axis, then we call this a (B, β) (or B) 
type system, illustrated in Figure F.3. This configuration is used mostly for adjustable 
floodlights and sports lighting. The angle β ∈ [−90°, 90°] measures the polar rotation 
around axis $A_3$. The angle B ∈ [0°, 360°] describes the equator rotation around axis 
$A_2$. 

FIGURE F.2 

The (C, γ) coordinate system. 

FIGURE F.3 

The (B, β) coordinate system. 

If the poles are placed along the third axis, then we call this a (A, α) (or A) type 
system, illustrated in Figure F.4. This is normally used for headlights and vehicle 
signal lighting. The angle α ∈ [−90°, 90°] measures the polar rotation around axis 
$A_2$. The angle A ∈ [0°, 360°] describes the equator rotation around axis $A_3$. 

It is important to observe that these terms, though standardized, are not used 
consistently. In particular the (A, α) and (C, γ) systems are often given the other’s 
names and labels [222]. Caveat emptor. 


F.2 Notation 

Both standards are based on human-readable text files of letters, characters, and 
punctuation. It would be very pleasant to be able to describe the files using the 
standard grammar notation of computer science [271], but many of our standard 
metacharacters (such as the asterisk * and the square brackets [ and ]) are used in the 
standards documents with different meanings. I think it would be confusing to use 
those symbols here in a way so different from their use in the reference documents. 
So instead I have adopted a different, more direct style of representation. 

The file formats will be shown by explicit presentation. Strings in a typewriter 
font are required. A field in an italic font represents the name of some data. I mark 
the type of data as integer (Z), floating-point real (R), or alphanumeric (A). 

Horizontal bars in the presentation are used just for conceptual grouping and are 
not part of the file. 

Often the standard requires that certain fields must begin new lines. I have found 
it easier to indicate where previous lines must end. The hook-left arrow (↩) indicates 
a combined carriage return-line feed; in ASCII that’s octal 015 followed by octal 
012. I will refer to this combined pair of characters in text with the term “newline.” 


F.3 The IES Standard 

The Illuminating Engineering Society of North America (or IESNA, or simply IES) 
has designed a standard for luminaire description [222]. The standard consists of 
a file, which contains a main block and a photometry block. The two blocks may 
exist in different files, with the main block providing the name of the file containing 
the photometry block. There is no explicit symbol that indicates the end of the main 
block and the start of the photometry block. The photometry block is defined to 
begin with the first line that begins with the string TILT. 




FIGURE F.4 

The (A, α) coordinate system. 

The main block describes the physical structure of the luminaire. The photometry 
block describes how it was measured and provides the actual data. A set of 
measurements is sometimes called a report. 


F.3.1 The Big Picture 

The main and photometry blocks are summarized in Figure F.5. Lines up to the 
TILT line may be no longer than eighty characters; the TILT line and all lines after 
it may be no longer than 132 characters. 

The first line of the file must be the seven characters IESNA91 followed by a 
newline. 

Then comes a number of lines that describe the luminaire. Each line begins with 
a keyword followed by an equals sign. The rest of the line contains the data that 
describes that keyword. Other keywords may be used, but they should be in all 
uppercase and no more than twenty characters long, including the brackets. The 
standard is somewhat contradictory in its use of square brackets. IES-approved keywords 
should certainly be surrounded by square brackets; user-invented keywords 
may or may not require brackets. 

As a practical measure, if a string in all capital letters starts a line before the line 
beginning with TILT, I suggest interpreting it as a keyword. 

The IES standard requires that the keywords TEST and MANUFAC be present; 
the others are optional. They recommend that LUMCAT, LUMINAIRE, LAMPCAT, 
and LAMP also be included, but they aren’t mandatory. The list of IES-supported 
keywords is given in Figure F.6. 

The MORE keyword can be used to continue the argument from a previous line 
if it was too long to fit. The BLOCK and ENDBLOCK keywords allow us to insert an 
additional set of keywords to attach to the photometry block. That way, one set of 
photometric data can be associated with a number of different luminaires. Although 
they don’t say so explicitly, by implication BLOCK and ENDBLOCK nest arbitrarily 
deep, though there is no advantage to nesting. 
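
The rules above can be sketched as a small reader for the main block. The Python below checks the IESNA91 identifier, splits the file at the first line that begins with TILT, collects bracketed keyword lines, folds MORE lines into the preceding keyword, and verifies that TEST and MANUFAC are present. The function names, the bracketed `[KEYWORD] value` line form (taken from Figure F.5), and the sample file contents are assumptions for illustration; BLOCK and ENDBLOCK grouping is not handled.

```python
def split_ies(text):
    """Return (main_block_lines, photometry_block_lines) for an IES file."""
    lines = text.splitlines()
    if not lines or lines[0].strip() != "IESNA91":
        raise ValueError("file must start with the IESNA91 identifier")
    for index, line in enumerate(lines):
        if line.startswith("TILT"):
            # The photometry block begins with the first TILT line.
            return lines[1:index], lines[index:]
    raise ValueError("no line beginning with TILT was found")

def parse_keywords(main_lines):
    """Collect [KEYWORD] value lines into a dict, joining MORE continuations."""
    keywords = {}
    last = None
    for line in main_lines:
        if not line.startswith("["):
            continue
        name, _, value = line[1:].partition("]")
        name, value = name.strip().upper(), value.strip()
        if name == "MORE" and last is not None:
            keywords[last] += " " + value
        else:
            keywords[name] = value
            last = name
    missing = {"TEST", "MANUFAC"} - keywords.keys()
    if missing:
        raise ValueError("missing required keywords: " + ", ".join(sorted(missing)))
    return keywords

# A made-up file fragment used only to exercise the functions.
SAMPLE = """IESNA91
[TEST] Report 1234
[MANUFAC] Example Lighting Co.
[LUMINAIRE] Two-lamp troffer
[MORE] with prismatic lens
TILT=NONE
"""

if __name__ == "__main__":
    main_block, photometry_block = split_ies(SAMPLE)
    print(parse_keywords(main_block))
    print(photometry_block[0])
```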


F.3.2 The Tilt Block 

The photometry block begins with the string TILT. This is the beginning of a smaller 
block called the tilt block, which may or may not be present. The argument to 
TILT indicates if there is no tilt block at all (TILT=NONE), if it is in another file 
(TILT=filename), or if the tilt block is included in this file (TILT=INCLUDE); the file 
name may be no more than seventy-five characters long. 

If there is a tilt block, then it either comes from another file or appears immediately 
after the TILT line. The format is the same in either case. 

IESNA91 ↩ 
    Required identifier. 

[TEST] test-report-information (A) ↩ 
[MANUFAC] manufacturer-of-luminaire (A) ↩ 
    Required key information. 

[keyword] key-information (A) ↩ 
[keyword] key-information (A) ↩ 
[keyword] key-information (A) ↩ 
    0 or more keyword lines. 

TILT=filename ↩ 
TILT=INCLUDE ↩ 
TILT=NONE ↩ 
    Choose only one. 

Tilt block ↩ 
    Only present if TILT=INCLUDE. 

number-of-lamps (Z) 
lumens-per-lamp (R) 
candela-multiplier (R) 
number-of-vertical-angles (Z) 
number-of-horizontal-angles (Z) 
photometric-type (Z) 
units (Z) 
width (R) 
length (R) 
height (R) 
    Luminaire description. 

ballast-factor (R) 
ballast-lamp-factor (R) 
input-watts (R) ↩ 
vertical-angles (R) 
horizontal-angles (R) 
    Measurement description. 

c1 (R) c2 (R) ... cn (R) ↩ 
    Candela data at each horizontal angle. 

FIGURE F.5 

Main block of IES standard. 

| Keyword | Argument type | Purpose |
|---|---|---|
| TEST | (A) | Test report and laboratory (mandatory). |
| MANUFAC | (A) | Manufacturer of luminaire (mandatory). |
| LUMCAT | (A) | Luminaire catalog number (recommended). |
| LUMINAIRE | (A) | Luminaire description (recommended). |
| LAMPCAT | (A) | Lamp catalog number (recommended). |
| LAMP | (A) | Lamp description (recommended). |
| BALLAST | (A) | The ballast used in the measurements. |
| MAINTCAT | (Z ∈ [1,6]) | An integer from 1 to 6 identifying the maintenance category. |
| OTHER | (A) | Free field for any other information. |
| SEARCH | (A) | For systems without a general text-search facility, we can provide a string here and flag it for an external program. |
| MORE | (A) | Extends the description from the previous line in the file. |
| BLOCK | none | Allows grouping (see text). |
| ENDBLOCK | none | End of a block (see text). |

FIGURE F.6 

Keywords for the IES main block. 


The tilt block is used because sometimes the efficiency of the lamp is sensitive to 
its orientation, so that it is brighter when pointing in some directions than others. 
The tilt specification allows us to describe how the lamp is mounted in the housing, 
and how its output depends on the rotation of the housing. 

The tilt block has the form given in Figure F.7. Notice that there are no keywords 
in the block; everything must be of the right type and in the right place. 

The first field, labeled lamp-to-luminaire-geometry , is an integer with the value 
1, 2, or 3, depending on how the lamp is mounted in the housing. Figure F.8 shows 
the three choices. In each of these three figures, the lamp is in a fixed position with 
respect to the housing, and the housing is rotated about a mounting axis. 

For Type 1 mounting, the lamp points straight out of the housing, so it always 
points in the same direction as the luminaire. For Type 2 mounting, the position of 
the lamp doesn’t change as the luminaire is rotated, since it is parallel to the axis of 
rotation. For Type 3 mounting, the lamp is perpendicular to the axis of rotation, so 
it moves as the housing moves. 

lamp-to-luminaire-geometry (Z ∈ [1,2,3]) ↩ 
number-of-pairs (Z) ↩ 
list of angles (R, R, R, ...) ↩ 
list of factors (R, R, R, ...) ↩ 

FIGURE F.7 

The IES tilt block. 





FIGURE F.8 

The three choices for lamp-to-luminaire-geometry. 
2 
5 
0 5 10 20 45 
1.0 0.9 0.8 0.6 0.5 


FIGURE F.9 

The tilt block for a lamp in a Type 2 mounting that was measured at five different angles: 0, 
5, 10, 20, and 45 degrees. 


Following the mounting type comes an integer that specifies the number of measurements 
that have been made of lamp output with respect to rotation. Then comes 
an ascending list of all the angles (in degrees), and then the relative lamp outputs. 
An example is shown in Figure F.9. 
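
As a rough illustration of the layout just described, the following Python sketch reads a tilt block and looks up a relative output for a given angle. The function names are hypothetical, and the linear interpolation between measured angles (clamped outside the measured range) is an assumption made here; the format itself only supplies the lists of angles and factors.

```python
def parse_tilt_block(text):
    """Read geometry code, pair count, angles, and factors from a tilt block."""
    tokens = text.split()
    geometry = int(tokens[0])
    if geometry not in (1, 2, 3):
        raise ValueError("lamp-to-luminaire-geometry must be 1, 2, or 3")
    count = int(tokens[1])
    values = [float(t) for t in tokens[2:2 + 2 * count]]
    if len(values) != 2 * count:
        raise ValueError("expected %d angles and %d factors" % (count, count))
    angles, factors = values[:count], values[count:]
    if any(b <= a for a, b in zip(angles, angles[1:])):
        raise ValueError("angles must be in ascending order")
    return geometry, angles, factors


def tilt_factor(angles, factors, angle):
    """Relative lamp output at `angle` (assumed: linear interpolation, clamped)."""
    if angle <= angles[0]:
        return factors[0]
    if angle >= angles[-1]:
        return factors[-1]
    for i in range(len(angles) - 1):
        lo, hi = angles[i], angles[i + 1]
        if lo <= angle <= hi:
            t = (angle - lo) / (hi - lo)
            return factors[i] + t * (factors[i + 1] - factors[i])


# The tilt block of Figure F.9.
geometry, angles, factors = parse_tilt_block(
    "2\n5\n0 5 10 20 45\n1.0 0.9 0.8 0.6 0.5\n"
)
```

Because the block carries no keywords, the sketch relies entirely on position: the first token is the mounting type, the second the pair count, and the rest are split evenly into angles and factors.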


F.3.3 The Photometry Block 

The photometry block contains a sequence of unflagged data values. Because there 
are no keywords in this block, everything must appear with exactly the right type, in 
exactly the right place, as in Figure F.5. The first set of fields addresses the physical 
setup of the luminaire and the configuration for measuring its output; we will now 
discuss those fields sequentially. 

number-of-lamps (Z): This is the number of lamps mounted in the luminaire; for 
example, Figure F.10 shows several bulbs mounted in a single housing. 

lumens-per-lamp (R): This is the average number of lumens per lamp in the luminaire. 

candela-multiplier (R): The photometric data at the end of the file may be uniformly 
scaled by this value; it is normally 1.0. 

number-of-vertical-angles (Z): This is the number of angles that were measured from 
pole to pole for this report. 

number-of-horizontal-angles (Z): This is the number of angles that were measured 
around the equator for this report. 

photometric-type (Z): This integer identifies which of the three goniometric configurations 
was used to measure the data. A value of 1 means type C, a value of 
2 means type B, and a value of 3 means type A. 
FIGURE F.10 

Several lamps in a single luminaire. 


units (Z): Identifies the measurement system. A value of 1 means units are in feet, a 
value of 2 means units are in meters. 

width (R): The width of the luminous opening, measured parallel to the A_3 axis; 
see Figure F.11. 

length (R): The length of the luminous opening, measured parallel to the A_2 axis; 
see Figure F.11. 

height (R): The height of the luminous opening, measured parallel to the A_1 axis; 
see Figure F.11. 

When a luminaire is not rectangular, the width, length, and height measurements 
aren’t very useful. Table F.1 indicates how to use these fields to encode circular, 
elliptical, and point sources. 

We now continue through the photometry block to the fields that address the 
electrical test setup. 
FIGURE F.11 

The width, length, and height of a rectangular luminaire with respect to its axes. 


Opening shape    Width          Length 

Rectangular      width          length 
Circular         -diameter      0 
Elliptical       -minor axis    major axis 
Point            0              0 


TABLE F.1 

Use of the width and length fields for nonrectangular luminaires. 


ballast-factor (R): Some lamps are operated on a ballast that can diminish their 
output. This is the percentage by which the output of the lamp diminishes on 
a ballast. 

ballast-lamp-factor (R): If the measurements in the file used a different ballast than 
in a standard installation, this factor gives the correction to turn the file data 
into the installation data. 

input-watts (R): This is the total watts applied to the luminaire for the test (including 
ballast, if any). 

vertical-angles (R): This is a list of the vertical angles where measurements were 
taken, in degrees, in ascending order. For type C measurements, the first value 
is always 0 or 90, and the last is always 90 or 180. 

horizontal-angles (R): This is a list of the horizontal angles where measurements 
were taken, in degrees, in ascending order. For type C measurements, the first 
value is always 0. The last value is interpreted as follows: 

0: There is only one horizontal angle, and the luminance is assumed to 
be symmetrical about this angle. 

90: Only one quadrant of data is provided; the luminance is symmetric 
with respect to each quadrant. 

180: Only half of the sphere is provided; the luminance is bilaterally symmetric. 

Other: This is a value from 180 to 360; the luminance has no lateral symmetry. 

For types A and B there are two general cases: 

■ The luminance is laterally symmetric about a vertical reference plane. 
Then the first angle is 0 and the last is less than 90. 

■ The luminance is not laterally symmetric about a vertical reference plane. 
The first angle is between -90 and 0, and the last is between 0 and 90. 
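
The rules above for interpreting the horizontal-angle list can be collected into one small routine. The following Python sketch is only illustrative: the function name and the returned labels are hypothetical, and the range checks read the stated bounds loosely (for example, it accepts a last angle of exactly 90 for types A and B).

```python
def horizontal_symmetry(photometric_type, horizontal_angles):
    """Classify the lateral symmetry implied by a horizontal-angle list.

    photometric_type is "A", "B", or "C"; the angles are in ascending degrees.
    The returned strings are labels invented for this sketch.
    """
    first, last = horizontal_angles[0], horizontal_angles[-1]
    if photometric_type == "C":
        if first != 0:
            raise ValueError("type C lists must start at 0")
        if last == 0:
            return "axial"        # one angle; symmetric about it
        if last == 90:
            return "quadrant"     # one quadrant of data
        if last == 180:
            return "bilateral"    # half of the sphere
        if 180 < last <= 360:
            return "none"         # no lateral symmetry
        raise ValueError("unexpected last angle %r" % last)
    # Types A and B: two general cases.
    if first == 0 and 0 <= last <= 90:
        return "bilateral"
    if -90 <= first < 0 <= last <= 90:
        return "none"
    raise ValueError("unexpected angle range %r..%r" % (first, last))


example = horizontal_symmetry("C", [0, 20, 40, 60, 90])
```

A renderer would typically use this result to decide how to mirror the measured data before sampling it over the full sphere.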

Finally come the measurements themselves, in candelas. We start at the first 
vertical angle and list the candela output of the luminaire at each of the horizontal 
angles. Then we move to the next vertical angle and start the horizontal list over 
again, generating a mesh of values. These are simply long lists of floating-point real 
numbers, separated by white space, commas, or both, and interrupted by mandatory 
newlines at the end of each set of horizontal measurements. Other newlines are 
permissible within the data. 
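
A minimal sketch of reading that mesh, following the ordering described in the paragraph above (one row per vertical angle, one entry per horizontal angle). The function name is hypothetical, and the sketch treats every newline as ordinary white space, so it does not check that the mandatory newlines are present.

```python
def read_candela_mesh(text, n_vertical, n_horizontal):
    """Split a stream of candela values into rows, one per vertical angle."""
    # Values may be separated by white space, commas, or both.
    values = [float(t) for t in text.replace(",", " ").split()]
    expected = n_vertical * n_horizontal
    if len(values) != expected:
        raise ValueError("expected %d values, found %d" % (expected, len(values)))
    return [
        values[row * n_horizontal:(row + 1) * n_horizontal]
        for row in range(n_vertical)
    ]


# The candela values at the end of Figure F.12: 3 vertical and 5 horizontal angles.
mesh = read_candela_mesh(
    "10000 8000 7000 5500 4000\n"
    "9000, 6500, 4500, 4000, 3000\n"
    "4000 1500 800 500 200\n",
    n_vertical=3,
    n_horizontal=5,
)
```

Here `mesh[v][h]` is the output at the v-th vertical angle and h-th horizontal angle; the candela-multiplier from earlier in the block would still need to be applied to each value.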

Figure F.12 shows an example file in the IES format. This example contains 
nonsense data that are only intended to demonstrate the format. 


F.4 The CIE Standard 

The Commission Internationale de L’Éclairage (the CIE) has developed an international 
standard that is capable of much richer expression than the IES standard. Like 
the IES, the CIE has chosen a plain-text file format, and has separated the data into 
blocks. In contrast to the IES, all data in the CIE standard is identified by a keyword. 
The conventions are somewhat different, though. 
IESNA91 
[TEST] Luminaire C6567681 
[MANUFAC] Deep 13 Labs 
[LUMCAT] 27599-3175 
[LUMINAIRE] Portable dry-cell searchlight. Includes 
[MORE] mounting brackets for vacuum cleaner 
[LAMPCAT] MST-3K 
[LAMP] Gypsy headmount 
[OTHER] Not terribly bright, but reliable 
[BLOCK] 
[LUMCAT] 94303 
[LUMINAIRE] Entertainment spot 
[ENDBLOCK] 
TILT=INCLUDE 
1 
5 
0 30 90 120 180 
1.0 0.95 0.92 0.75 0.65 
1 10000 1.0 
3 5 1 
2 
.4 .8 .5 
1.0 1.0 6500 
0 45 90 
0 20 40 60 90 
10000 8000 7000 5500 4000 
9000 6500 4500 4000 3000 
4000 1500 800 500 200 


FIGURE F.12 

An example file in the IES standard. 


We will use the same notational conventions as in the section on the IES standard. 
The main difference is that the CIE standard keywords are four uppercase letters that 
are not enclosed in square brackets, and may contain arbitrary spaces and lowercase 
characters. So the keyword NLPS may be written as Number of LamPS or even as 
aaNbbLcc dd eeP Sf f, and it will still be recognized as NLPS. 
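
The recognition rule can be sketched in a few lines of Python: keep only the uppercase characters to the left of the equals sign and check that four remain. The function names are hypothetical, and keeping digits (so that a numbered key such as LA01 survives) is an assumption of this sketch rather than something the text above states.

```python
def cie_key(raw):
    """Reduce a keyword expansion to its four-character key."""
    # Lowercase letters and spaces are ignored; digits are kept (assumption).
    key = "".join(ch for ch in raw if "A" <= ch <= "Z" or ch.isdigit())
    if len(key) != 4:
        raise ValueError("not a four-character keyword: %r" % raw)
    return key


def split_keyline(line):
    """Split 'keyword = data' into (key, data)."""
    raw_key, separator, data = line.partition("=")
    if not separator:
        raise ValueError("keyline has no equals sign: %r" % line)
    return cie_key(raw_key), data.strip()


examples = [
    cie_key("NLPS"),
    cie_key("Number of LamPS"),
    cie_key("aaNbbLcc dd eeP Sf f"),
]
```

All three entries in `examples` reduce to the same key, NLPS, which is what lets a file use either the terse codes or the long, readable expansions.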

Lines may be no longer than seventy-eight characters each. 
CIEF=CIE File Format, Version 1.0 (CIE Publication 102-1993) ↵    Required identifier. 

information line ↵ 
information line ↵                                                0 to 60 free-format information lines. 
... 
information line ↵ 

IDNM=identification-number (A) ↵                                  Required identifier. 

keyword=data ↵ 
keyword=data ↵                                                    The measurement block. 
... 
keyword=data ↵ 

PHOT=INCLUDE ↵ 
PHOT=filename ↵                                                   Choose only one. 

information line ↵ 
information line ↵                                                0 to 60 free-format information lines. 
... 
information line ↵ 

PTYP=type ↵                                                       Required keyline. 

keyword=data ↵ 
keyword=data ↵                                                    The photometry block. 
... 
keyword=data ↵ 

CONA=cone angles ↵ 


FIGURE F.13 

Main CIE file format. 


F.4.1 The Main Block 

The general format of the CIE file standard is shown in Figure F.13. 

The file must begin with the keyline that identifies the file type, version, and the 
CIE publication that specifies the standard; this line must appear exactly as shown. 
Then comes the information block, which may contain any information at all. This block 
may contain from zero to sixty free-format text lines. The lines should not begin 
with anything that might be confused with the first line of the measurement block; 
simply avoiding any equal signs (=) will make sure that you’re safe. 

The information block is followed by the measurement block . The signal to 
the system that the measurement block has begun is the appearance of the IDNM 
keyword (possibly including lowercase letters and blanks). 

At the end of the measurement block comes the photometry block , which may 
immediately follow in the same file or be included in a different file that is pointed 
to by name. 


F.4.2 The Measurement Block 

The measurement block begins with the IDNM keyword, but may then contain a 
variety of other keywords and data. None of them but IDNM is mandatory. Each 
keyword is followed by an equals sign (=) and then its associated data. We list 
those keywords in Figure F.14 by their four-letter codes and their recommended 
expansions, including lowercase letters and blanks. 

The rotation of the luminaire is given by the TLME and ROME keywords, illustrated 
in Figure F.15. 

The numeric codes associated with the luminaire shapes used for LSHP are the 
following, illustrated in Figure F.16: 

1 A sphere. 

2 A half-sphere in the A_1 direction. 

3 A cylinder parallel to A_1. 

4 A cylinder parallel to A_2. 

5 A half-cylinder parallel to A_2, round half toward A_1. 

6 A half-cylinder parallel to A_3, round half toward A_1. 

7 A rectangle with long side perpendicular to A_1. 

8 A rectangle with long side parallel to A_1. 

9 Anything else. 

The measurement block can also provide for the fact that as you look at the 
luminaire from different angles, different amounts of the luminous opening (through 
which light escapes) are visible, due to blockages by the housing. This can be 
encoded by a number of lines that give the area visible in square meters from a 
variety of angles. Using the keyword NLAV provides the number of views, and then 
the keywords LA01, LA02, and so on provide the view information. Each line 
contains the area in square meters, and the angles (θ, ψ) for that view. There is a 
maximum of ninety-nine views. 
Key    Expansion                      Argument       Meaning 

IDNM   IDentification NuMber          (A)            An arbitrary alphanumeric string that identifies 
                                                     the particular test trial or otherwise gives 
                                                     information particular to this file. 
LUMN   LUMinaire Name                 (A)            The name of the luminaire. 
LAMP   LAMP name                      (A)            The name of the lamp. 
NLPS   Number of LamPS                (Z)            The number of lamps in the luminaire. 
TOLU   TOtal LUmens                   (R)            Total lumens generated by the luminaire. 
LLGE   Lamp Luminaire GEometry        (Z ∈ [1,4])    The relationship of the lamp to the housing, as 
                                                     in Figure F.8. The additional value 4 is defined 
                                                     for a lamp that cannot be replaced. 
BLID   BaLlast IDentification         (A)            The name of the ballast. 
INPW   INput PoWer                    (R)            The total input power in watts. 
INVO   INput VOltage                  (R)            The voltage for which the luminaire is rated. 
INVA   INput Volt Amperes             (R)            The total volt ampere requirement of the 
                                                     luminaire. 
TLME   TiLt during MEasurement        (R)            The degrees of tilt during photometry. This is 
                                                     rotation of the luminaire about the A_2 axis, as 
                                                     in Figure F.15. 
TLNM   TiLt NorMal                    (R)            The amount of tilt around the A_2 axis that is 
                                                     normal when installed. 
ROME   ROtation during MEasurement    (R)            The degrees of rotation during photometry. This 
                                                     is rotation of the luminaire about the A_3 axis, 
                                                     as in Figure F.15. 
LSHP   Luminaire SHaPe                (Z ∈ [1,9])    This is one of the nine shape codes illustrated 
                                                     in Figure F.16. 
NLAV   Number of Luminous Area Views  (Z)            The number of views (see text). 
LAnn   Luminous Area view nn          (R, R, R)      (A, θ, ψ) (see text) 


FIGURE F.14 

Keywords for the CIE measurement block. 
FIGURE F.15 

The TLME and ROME tilts measure rotation of the luminaire. (a) The A_2 axis for TLME. (b) The A_3 
axis for ROME. 


FIGURE F.16 

The luminaire shape codes for LSHP. (a) Code 1. (b) Code 2. (c) Code 3. (d) Code 4. (e) Code 5. 
(f) Code 6. (g) Code 7. (h) Code 8. (i) Code 9. 
Key    Expansion                      Argument       Meaning 

PTYP   Photometric TYPe               (A ∈ [A,B,C])  The goniometric configuration. 
APOS   Angle POSition                 (A)            Polar positioning code (see text). 
LUBA   LUmen BAsis of photometry      (R)            Scale factor for lumens specifications. A value 
                                                     of -1 indicates they are absolute values. 
MULT   MULTiplier                     (R)            A non-negative multiplying factor applied to all 
                                                     intensity values. Normally 1.0. 
BAFA   BAllast FActor                 (R)            A non-negative multiplying factor applied to all 
                                                     intensity values in the file, intended to 
                                                     compensate for changes in ballast. Normally 1.0. 
NCON   Number of CONe angles          (Z)            Number of cone angles in the data below. 
NPLA   Number of PLAne angles         (Z)            Number of plane angles in the data below. 
CONA   CONe Angles                                   The cone angles (see text). 


FIGURE F.17 

Keys for the CIE photometry block. 


F.4.3 The Photometry Block 

After the measurement block comes the photometry block. It is introduced with the 
keyword PHOT (or an expansion including spaces and lowercase letters). There are 
two choices of arguments to PHOT: a filename or the string INCLUDE. If a filename 
is given, then the photometry block is read from that file. If the string is INCLUDE, 
then the photometric data starts immediately. 

If the photometry block is coming from a file, then that file must begin with the 
header line: 

CIEA=CIE-A File Format, Version 1.0 (CIE Publication 102-1993) 
Then the block follows immediately. 

Like the measurement block, the photometry block contains a set of keywords 
and data. In this block the term half-plane angle is used to refer to the angle θ around 
the equator, and the term cone angle is used to refer to the polar angle ψ. The terms 
for this block are given in Figure F.17. 

The photometry block may begin with up to sixty information lines, like the main 
block. It must begin with the PTYP key to signify the start of the block. 

The APOS line serves to orient the equator of the sphere in which the measurements 
are made. When the luminaire has a set of three well-defined axes as in 
Figure F.1, then orientation is no problem in any of the goniometric systems. The 
right axis is used to select the poles, and the other two span the hemisphere with one 
of the axes serving to select the 0° origin. 

However, even when the luminaire doesn’t have such natural axes, the sphere 
must still be oriented. The CIE has defined a wealth of different geometry codes to 
cover a variety of special-purpose configurations; they’re all listed in the standard 
[434]. Most luminaires used in computer graphics can be described with respect to 
one of the three default coordinate systems, and do not need an additional APOS 
specifier. 

The photometry block must end with the CONA keyline. This lists the values of ψ 
where measurements were made, in ascending order. For type C configurations the 
first value must be 0. After the CONA key and its arguments come the photometric 
data. Each line begins with a value of θ, and then lists the candlepower for the 
luminaire at each value of ψ given on the CONA line. Then comes a newline followed 
by the next value of θ, and so on. The values of θ must be given in ascending order. 
Notice that this format is rather different from the IES format, though it would not 
be hard to change one into the other. Values may be separated by commas or any 
white space, including newlines. 
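
The data that follow the CONA keyline can be read as fixed-size groups: one half-plane angle followed by one intensity per cone angle. The Python sketch below assumes the cone angles have already been parsed from the CONA line; the function name and the dictionary it returns are choices made for this illustration.

```python
def read_cie_intensities(cone_angles, text):
    """Map each half-plane angle to its list of intensities, one per cone angle."""
    # Values may be separated by commas or any white space, including newlines.
    values = [float(t) for t in text.replace(",", " ").split()]
    width = 1 + len(cone_angles)
    if len(values) % width != 0:
        raise ValueError("data do not divide into groups of %d values" % width)
    table = {}
    previous = None
    for start in range(0, len(values), width):
        half_plane = values[start]
        if previous is not None and half_plane <= previous:
            raise ValueError("half-plane angles must be in ascending order")
        table[half_plane] = values[start + 1:start + width]
        previous = half_plane
    return table


# The data at the end of Figure F.18, which lists four cone angles.
table = read_cie_intensities(
    [0.0, 10.0, 20.0, 25.0],
    "-15 100 95 50 13\n-5 110 102 63 20\n0 90 80 44 8\n10 70 50 24 3\n",
)
```

Since newlines count as ordinary separators here, the grouping comes from the number of cone angles rather than from the line structure.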

Figure F.18 presents an arbitrary example. This example uses only some of the 
keywords. Although this is based on a real floodlight, this example contains nonsense 
data that is only intended to demonstrate the format. 
CIEF=CIE File Format, Version 1.0 (CIE Publication 102-1993) 
Example of the CIE File Format using long names 
Floodlight Luminaire based on CIE Example file 2 
Luminaire made in Moebius's Garage 
IDentification NuMber=Catalog 3141 
LUMinaire Name=Floodgate V5 
LAMP name = 200 Watt SuperFlood 
Number of LamPS = 1 
Lamp Luminaire GEometry = 2 
BaLlast IDentification = G403 
TOtal LUmens = 17500 
INput Volt Amperes=190 
Luminaire SHaPe = 8 
TiLt during MEasurement = 0 
Number of Luminous Area Views = 0 
PHOTometric file = INCLUDE 
Mounting is standard shell WA2YHJ 
Used in top half of the Whirlitzer of Wonder 
Cabinet finish is walnut 
Photometric TYPe = B 
Angle Position Code = B3 
LUmen BAsis of photometry = 100 
MULTiplier = 1 
BAllast FActor=1 
Number of CONe angles = 3 
Number of PLane Angles = 4 
CONe Angles = 0.0 10.0 20.0 25.0 
-15 100 95 50 13 
-5 110 102 63 20 
0 90 80 44 8 
10 70 50 24 3 


FIGURE F.18 

An example file in the CIE Standard. 






The first truth is the form. You must put into 
your drawing most forcefully the facts which 
you know to be true rather than what you see. 
What you see, the impression a thing makes on 
the eye, will take care of itself—in fact, much of 
the time it is far too insistent. You cannot 
truthfully portray vision without a knowledge 
of the facts which underlie it. 

Kimon Nicolaides 

(“The Natural Way To Draw,” 1941) 



REFERENCE 


DATA 


This appendix gathers together useful spectral and material data for humans, materials, 
and light sources. The data has come from a variety of sources, which are 
indicated in the text for each section. To make the data immediately useful, I have 
presented it here in 5-nanometer increments from [380,775] nanometers, for a total 
of 80 values per curve. Sometimes this has meant interpolating the published data; I 
reconstructed the data with a Hermite cubic spline and point-sampled it to get these 
values. I didn’t extrapolate the data beyond its measured endpoints, since for some 
of the data such extrapolation would have yielded values that would have dwarfed 
the rest of the curve. Therefore I instead simply used the endpoint values when out 
of range. 
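
The resampling procedure can be sketched as follows. This is not the code used to produce the tables: the text does not say how the spline tangents were chosen, so the averaged finite-difference tangents below are an assumption, and the function name is hypothetical. The clamping to endpoint values outside the measured range follows the description above.

```python
from bisect import bisect_right


def hermite_resample(xs, ys, start=380, stop=775, step=5):
    """Point-sample a cubic Hermite interpolant of (xs, ys) on a regular grid.

    xs must be ascending and hold at least two wavelengths. Outside
    [xs[0], xs[-1]] the endpoint value is used instead of extrapolating.
    """
    n = len(xs)
    slopes = [(ys[i + 1] - ys[i]) / (xs[i + 1] - xs[i]) for i in range(n - 1)]
    # Assumed tangents: one-sided at the ends, averaged neighbors inside.
    tangents = (
        [slopes[0]]
        + [(slopes[i - 1] + slopes[i]) / 2 for i in range(1, n - 1)]
        + [slopes[-1]]
    )
    samples = []
    for x in range(start, stop + 1, step):
        if x <= xs[0]:
            samples.append((x, ys[0]))
            continue
        if x >= xs[-1]:
            samples.append((x, ys[-1]))
            continue
        i = bisect_right(xs, x) - 1          # interval with xs[i] <= x < xs[i+1]
        h = xs[i + 1] - xs[i]
        t = (x - xs[i]) / h
        h00 = 2 * t**3 - 3 * t**2 + 1        # cubic Hermite basis functions
        h10 = t**3 - 2 * t**2 + t
        h01 = -2 * t**3 + 3 * t**2
        h11 = t**3 - t**2
        y = (h00 * ys[i] + h10 * h * tangents[i]
             + h01 * ys[i + 1] + h11 * h * tangents[i + 1])
        samples.append((x, y))
    return samples


# Made-up, nonuniformly spaced measurements lying on a straight line.
samples = hermite_resample([400, 450, 700], [0.0, 0.5, 3.0])
```

With the default grid the result has 80 samples, matching the 80 values per curve mentioned above; samples below 400 and above 700 in this example simply repeat the endpoint values.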

Morgan Kaufmann Publishers and the creators of this data have agreed to release 
an electronic form of this appendix to the public domain, so you don’t need to type 
it all in to use it. As of publication, all the data in this appendix is available on the 
anonymous ftp server ftp.cs.princeton.edu (internet address 128.112.92.1), 
under the directory pub/people/ps/glassner. Please feel free to download the 
data, use it, and share it. You may also repost or redistribute the information in this 
appendix freely. This release to the public domain applies to this appendix only, and 
to no other part of this book. 
G.1 Material Data 

Table G.1 provides some indices of refraction for a variety of materials at normal 
incidence. The data is from Fowles [148] and Wood [488]. 

The complex indices of refraction for four different conductors are given in Tables 
G.2 and G.3. The tables were interpolated from nonuniform data presented in Palik 
[329]. The data is presented graphically in Figures G.1 and G.2. 
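
One common use of a complex index of refraction is the standard Fresnel reflectance of a conductor at normal incidence from air or vacuum, R = ((η - 1)² + κ²) / ((η + 1)² + κ²). The Python sketch below evaluates that formula; the function name is hypothetical, and the sample values are the 775-nanometer entry for gold in Table G.3 and an ordinary dielectric with η = 1.5 and no extinction.

```python
def normal_reflectance(eta, kappa):
    """Fresnel reflectance at normal incidence for a surface seen from vacuum.

    eta is the index of refraction and kappa the extinction coefficient.
    """
    return ((eta - 1.0) ** 2 + kappa ** 2) / ((eta + 1.0) ** 2 + kappa ** 2)


gold_775 = normal_reflectance(0.174000, 4.860000)   # gold at 775 nm (Table G.3)
glass_like = normal_reflectance(1.5, 0.0)           # dielectric, no extinction
```

Evaluating this at each tabulated wavelength gives a reflectance spectrum for the metal; the large extinction coefficient is what makes the value for gold so much higher than for the dielectric.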
Optically isotropic crystals 
and materials                  Simple index η 

Perfect vacuum                 1 
Air                            1.0003 
Water                          1.33 
Isopropyl alcohol              1.38 
Red garnet                     1.86 
Zinc sulfide                   2.36 
Sodium chloride                1.544 
Diamond                        2.417 
Fluorite                       1.392 

Uniaxial positive crystals     Ordinary index η_O    Extraordinary index η_E 

Ice                            1.309                 1.310 
Quartz                         1.544                 1.553 
Zircon                         1.923                 1.968 
Rutile                         2.616                 2.903 

Uniaxial negative crystals     Ordinary index η_O    Extraordinary index η_E 

Beryl                          1.598                 1.590 
Sodium nitrate                 1.587                 1.336 
Calcite                        1.659                 1.486 
Tourmaline                     1.669                 1.638 

Biaxial crystals               η_1                   η_2                   η_3 

Gypsum                         1.520                 1.523                 1.530 
Feldspar                       1.522                 1.526                 1.530 
Mica                           1.552                 1.582                 1.588 
Topaz                          1.619                 1.620                 1.627 


TABLE G.1 

Some indices of refraction at normal incidence. Frequency of measurement not available. Source: 
Data from Fowles, Introduction to Modern Optics, table 6.1, p. 176, and Wood, Crystals and 
Light, table 12-1, p. 114. 
FIGURE G.1 

Index of refraction for aluminum, silver, copper, and gold. 


FIGURE G.2 

Extinction coefficients for aluminum, silver, copper, and gold. 















λ    Al η(λ)   Al κ(λ)   Ag η(λ)   Ag κ(λ)      λ    Al η(λ)   Al κ(λ)   Ag η(λ)   Ag κ(λ) 

380  0.459522  4.712543  0.172643  1.801357     580  1.095940  7.034889  0.119176  3.580860 
385  0.459522  4.712543  0.172643  1.801357     585  1.123162  7.092927  0.119927  3.620832 
390  0.464578  4.737463  0.172179  1.824736     590  1.150000  7.150000  0.121000  3.660000 
395  0.477268  4.799026  0.172143  1.886179     595  1.176546  7.206569  0.122437  3.697000 
400  0.490000  4.860000  0.173000  [illegible]  600  1.200000  7.260000  0.124046  3.733363 
405  0.502732  4.920975  0.173857  [illegible]  605  1.220000  7.310000  0.125766  3.769467 
410  0.515422  4.982537  0.173821  2.075264     610  1.243866  7.364944  0.127538  3.805689 
415  0.528027  5.046061  0.171576  [illegible]  615  1.271220  7.422566  0.129303  3.842407 
420  0.540540  5.111395  0.167032  2.183605     620  1.300000  7.480000  0.131000  3.880000 
425  0.553012  5.175306  0.162298  2.231833     625  1.328330  7.535376  0.132869  3.919219 
430  0.565131  5.233723  0.159271  2.278383     630  1.356345  7.588417  0.134590  3.959434 
435  0.577244  5.288472  0.158423  2.324504     635  1.384367  7.639844  0.136141  4.000389 
440  0.589948  5.344414  0.157828  2.371326     640  1.411292  7.690237  0.137502  4.041828 
445  0.603591  5.405324  0.155916  2.420021     645  1.439278  7.740152  0.138652  4.083495 
450  0.618000  5.470000  0.152160  2.470635     650  1.470000  7.790000  0.139570  4.125135 
455  0.632493  5.531815  0.147613  2.520932     655  1.502859  7.841236  0.140110  4.166709 
460  0.646781  5.592108  0.143045  2.569545     660  1.534670  7.896558  0.140229  4.208175 
465  0.660667  5.653007  0.138757  2.615768     665  1.566741  7.953478  0.140185  4.249203 
470  0.674693  5.714261  0.135317  2.660014     670  1.600000  8.010000  0.140050  4.289791 
475  0.689087  5.775548  0.132764  2.702999     675  1.634324  8.066238  0.139897  4.329939 
480  0.703759  5.836021  0.131138  2.744957     680  1.670565  8.119970  0.139801  4.369645 
485  0.718931  5.896012  0.130306  2.786577     685  1.708526  8.171080  0.139833  4.408909 
490  0.734853  5.956376  0.130009  2.828568     690  1.747843  8.219450  0.140154  4.447676 
495  0.751561  6.017607  0.129992  2.871323     695  1.787991  8.265363  0.141061  4.485809 
500  0.769000  6.080000  0.130061  2.916154     700  1.830000  8.310000  0.142139  4.523594 
505  0.786605  6.138476  0.130105  2.961894     705  1.872679  8.354383  0.143315  4.561111 
510  0.786342  6.196043  0.130103  3.007563     710  1.919912  8.400550  0.144516  4.598441 
515  0.784987  6.255528  0.130043  3.052436     715  1.972871  8.452595  0.145667  4.635663 
520  0.806091  6.318217  0.130125  3.095771     720  2.030324  8.501185  0.146695  4.672859 
525  0.844815  6.382319  0.130219  3.137532     725  2.090685  8.543206  0.147527  4.710108 
530  0.878276  6.443884  0.130091  3.178189     730  2.153505  8.575872  0.148028  4.747403 
535  0.898602  6.502451  0.129646  3.218184     735  2.220514  8.598589  0.147986  4.784542 
540  0.915377  6.563588  0.128675  3.257945     740  2.285233  8.611949  0.147682  4.821890 
545  0.934823  6.631030  0.126824  3.297745     745  2.345712  8.618302  0.147174  4.859458 
550  0.958000  6.690000  0.124775  3.337667     750  2.400000  8.620000  0.146517  4.897254 
555  0.981445  6.743231  0.122763  3.377699     755  2.450850  8.623612  0.145769  4.935289 
560  1.002914  6.801953  0.121029  3.417830     760  2.499693  8.625132  0.144987  4.973571 
565  1.024224  6.861834  0.119768  3.458226     765  2.545931  8.622376  0.144227  5.012111 
570  1.046286  6.919963  0.119005  3.499347     770  2.589417  8.614337  0.143545  5.050918 
575  1.070014  6.977149  0.118838  3.540295     775  2.630000  8.600000  0.143000  5.090000 


TABLE G.2 

Indices of refraction and extinction for aluminum and silver. 


















λ    Cu η(λ)   Cu κ(λ)   Au η(λ)   Au κ(λ)      λ    Cu η(λ)   Cu κ(λ)   Au η(λ)   Au κ(λ) 

380  1.188280  2.078662  1.678455  1.953596     580  0.595384  2.703188  0.259998  2.909502 
385  1.188280  2.078662  1.678455  1.953596     585  0.527793  2.753184  0.247300  2.911371 
390  1.187100  2.093040  1.675263  1.953178     590  0.468000  2.810000  0.236000  2.911390 
395  1.185312  2.117618  1.666817  1.953892     595  0.415458  2.875556  0.226102  2.910510 
400  1.183702  2.142816  1.658000  1.956000     600  0.372648  2.945518  0.217697  2.909684 
405  1.182223  2.168463  1.649184  1.958108     605  0.338234  3.018414  0.210535  2.909863 
410  1.180822  2.194386  1.640737  1.958822     610  0.310878  3.092775  0.204364  2.912000 
415  1.179452  2.220412  1.634183  1.957185     615  0.289246  3.167127  0.198936  2.917045 
420  1.178062  2.246369  1.629351  1.952822     620  0.272000  3.240000  0.194000  2.925952 
425  1.176602  2.272084  1.622275  1.945549     625  0.256103  3.312184  0.188606  2.939671 
430  1.175023  2.297387  1.612759  1.936360     630  0.243055  3.381591  0.183521  2.959155 
435  1.173274  2.322103  1.599890  1.925744     635  0.232653  3.448394  0.178812  2.985355 
440  1.171306  2.346061  1.579005  1.912865     640  0.224696  3.512766  0.174544  3.019223 
445  1.169114  2.369725  1.548932  1.897096     645  0.218982  3.574880  0.170786  3.061712 
450  1.166702  2.393291  1.510154  1.878760     650  0.215308  3.634911  0.167605  3.113773 
455  1.164034  2.415747  1.464924  1.860113     655  0.213646  3.693233  0.165016  3.179622 
460  1.161139  2.437040  1.417717  1.841519     660  0.213653  3.750162  0.163000  3.260394 
465  1.158046  2.457114  1.374532  1.821149     665  0.214350  3.805661  0.161554  3.348767 
470  1.154787  2.475917  1.325871  1.805653     670  0.215000  3.860000  0.160597  3.442244 
475  1.151390  2.493393  1.268297  1.797055     675  0.214749  3.914480  0.160049  3.538330 
480  1.150639  2.511981  1.195681  1.790414     680  0.214074  3.967796  0.159829  3.634528 
485  1.151381  2.530700  1.110843  1.790218     685  0.213330  4.019675  0.159855  3.728342 
490  1.151354  2.547674  1.021396  1.803380     690  0.212720  4.069952  0.160027  3.817474 
495  1.150138  2.562687  0.933039  1.832168     695  0.212380  4.118678  0.160222  3.900933 
500  1.147313  2.575524  0.846880  1.875281     700  0.212566  4.166057  0.160515  3.978438 
505  1.142456  2.585970  0.767504  1.933520     705  0.213295  4.212465  0.160903  4.050759 
510  1.135149  2.593811  0.695794  2.004628     710  0.214605  4.258351  0.161383  4.118664 
515  1.124970  2.598830  0.631604  2.085559     715  0.216450  4.303980  0.161953  4.182923 
520  1.114585  2.599943  0.571663  2.177679     720  0.218648  4.349268  0.162609  4.244306 
525  1.102313  2.598045  0.516924  2.276178     725  0.221044  4.394238  0.163349  4.303581 
530  1.085276  2.594923  0.469422  2.374557     730  0.223483  4.438778  0.164163  4.361628 
535  1.062659  2.591761  0.429107  2.469264     735  0.225852  4.482510  0.165023  4.419346 
540  1.034637  2.587950  0.395941  2.557729     740  0.228145  4.525993  0.165954  4.476388 
545  1.003658  2.579443  0.369509  2.640573     745  0.230367  4.569252  0.166952  4.532793 
550  0.965898  2.575114  0.348463  2.714364     750  0.232522  4.612311  0.168010  4.588597 
555  0.921582  2.576699  0.331370  2.779947     755  0.234613  4.655197  0.169122  4.643842 
560  0.870931  2.585936  0.316801  2.838163     760  0.236647  4.697934  0.170283  4.698567 
565  0.812255  2.603536  0.303023  2.883275     765  0.238625  4.740548  0.171487  4.752808 
570  0.740885  2.627422  0.288205  2.896406     770  0.240554  4.783062  0.172728  4.806607 
575  0.667504  2.660953  0.273748  2.904831     775  0.242437  4.825503  0.174000  4.860000 


TABLE G.3 

Indices of refraction and extinction for copper and gold. 
FIGURE G.3 

The CIE standard observer matching functions. 


G.2 Human Data 


Table G.4 tabulates the CIE standard observer matching functions; these functions 
are plotted in Figure G.3. This data is from Wyszecki and Stiles [489]. 

Table G.5 tabulates the SML cone response curves for a human being. The data 
in this table was interpolated from data collected from [370,730] nanometers in 
1-nanometer increments in Brian Wandell’s laboratory at Stanford University. The 
data is plotted in Figure G.4. 
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D 



2(A) 

V(A) 

A 

x(X) 

yW 

z(A) 

V(A) 

380 

0.0014 

0.0000 

0.0065 

0.0000 

580 

0.9163 

0.8700 

0.0017 

0.8700 

385 

0.0022 

0.0001 

0.0105 

0.0001 

585 

0.9786 

0.8163 

0.0014 

0.8163 

390 

0.0042 

0.0001 

0.0201 

0.0001 

590 

1.0263 

0.7570 

0.0011 

0.7570 

395 

0.0076 

0.0002 

0.0362 

0.0002 

595 

1.0567 

0.6949 

0.0010 

0.6949 

400 

0.0143 

0.0004 

0.0679 

0.0004 

600 

1.0622 

0.6310 

0.0008 

0.6310 

405 

0.0232 

0.0006 

0.1102 

0.0006 

605 

1.0456 

0.5668 

0.0006 

0.5668 

410 

0.0435 

0.0012 

0.2074 

0.0012 

610 

1.0026 

0.5030 

0.0003 

0.5030 

415 

0.0776 

0.0022 

0.3713 

0.0022 

615 

0.9384 

0.4412 

0.0002 

0.4412 

420 

0.1344 

0.0040 

0.6456 

0.0040 

620 

0.8544 

0.3810 

0.0002 

0.3810 

425 

0.2148 

0.0073 

1.0391 

0.0073 

625 

0.7514 

0.3210 

0.0001 

0.3210 

430 

0.2839 

0.0116 

1.3856 

0.0116 

630 

0.6424 

0.2650 

0.0000 

0.2650 

435 

0.3285 

0.0168 

1.6230 

0.0168 

635 

0.5419 

0.2170 

0.0000 

0.2170 

440 

0.3483 

0.0230 

1.7471 

0.0230 

640 

0.4479 

0.1750 

0.0000 

0.1750 

445 

0.3481 

0.0298 

1.7826 

0.0298 

645 

0.3608 

0.1382 

0.0000 

0.1382 

450 

0.3362 

0.0380 

1.7721 

0.0380 

650 

0.2835 

0.1070 

0.0000 

0.1070 

455 

0.3187 

0.0480 

1.7441 

0.0480 

655 

0.2187 

0.0816 

0.0000 

0.0816 

460 

0.2908 

0.0600 

1.6692 

0.0600 

660 

0.1649 

0.0610 

0.0000 

0.0610 

465 

0.2511 

0.0739 

1.5281 

0.0739 

665 

0.1212 

0.0446 

0.0000 

0.0446 

470 

0.1954 

0.0910 

1.2876 

0.0910 

670 

0.0874 

0.0320 

0.0000 

0.0320 

475 

0.1421 

0.1126 

1.0419 

0.1126 

675 

0.0636 

0.0232 

0.0000 

0.0232 

480 

0.0956 

0.1390 

0.8130 

0.1390 

680 

0.0468 

0.0170 

0.0000 

0.0170 

485 

0.0580 

0.1693 

0.6162 

0.1693 

685 

0.0329 

0.0119 

0.0000 

0.0119 

490 

0.0320 

0.2080 

0.4652 

0.2080 

690 

0.0227 

0.0082 

0.0000 

0.0082 

495 

0.0147 

0.2586 

0.3533 

0.2586 

695 

0.0158 

0.0057 

0.0000 

0.0057 

500 

0.0049 

0.3230 

0.2720 

0.3230 

700 

0.0114 

0.0041 

0.0000 

0.0041 

505 

0.0024 

0.4073 

0.2123 

0.4073 

705 

0.0081 

0.0029 

0.0000 

0.0029 

510 

0.0093 

0.5030 

0.1582 

0.5030 

710 

0.0058 

0.0021 

0.0000 

0.0021 

515 

0.0291 

0.6082 

0.1117 

0.6082 

715 

0.0041 

0.0015 

0.0000 

0.0015 

520 

0.0633 

0.7100 

0.0782 

0.7100 

720 

0.0029 

0.0010 

0.0000 

0.0010 

525 

0.1096 

0.7932 

0.0573 

0.7932 

725 

0.0020 

0.0007 

0.0000 

0.0007 

530 

0.1655 

0.8620 

0.0422 

0.8620 

730 

0.0014 

0.0005 

0.0000 

0.0005 

535 

0.2257 

0.9149 

0.0298 

0.9149 

735 

0.0010 

0.0004 

0.0000 

0.0004 

540 

0.2904 

0.9540 

0.0203 

0.9540 

740 

0.0007 

0.0003 

0.0000 

0.0003 

545 

0.3597 

0.9803 

0.0134 

0.9803 

745 

0.0005 

0.0002 

0.0000 

0.0002 

550 

0.4334 

0.9950 

0.0087 

0.9950 

750 

0.0003 

0.0001 

0.0000 

0.0001 

555 

0.5121 

1.0002 

0.0057 

1.0002 

755 

0.0002 

0.0001 

0.0000 

0.0001 

560 

0.5945 

0.9950 

0.0039 

0.9950 

760 

0.0002 

0.0001 

0.0000 

0.0001 

565 

0.6784 

0.9786 

0.0027 

0.9786 

765 

0.0001 

0.0000 

0.0000 

0.0000 

570 

0.7621 

0.9520 

0.0021 

0.9520 

770 

0.0001 

0.0000 

0.0000 

0.0000 

575 

0.8425 

0.9154 

0.0018 

0.9154 

775 

0.0000 

0.0000 

0.0000 

0.0000 


TABLE G.4

CIE color-matching and spectral efficiency functions. 
λ

S(λ)

M(λ)

L(λ)

λ

S(λ)

M(λ)

L(λ)

380 

0.000360 

0.000221 

0.000179 

580 

0.000026 

0.255411 

0.614555 

385 

0.000729 

0.000338 

0.000415 

585 

0.000022 

0.221423 

0.594825 

390 

0.001487 

0.000606 

0.000893 

590 

0.000018 

0.187233 

0.569736 

395 

0.002778 

0.001100 

0.001690 

595 

0.000014 

0.154635 

0.540161 

400 

0.004501 

0.001774 

0.002726 

600 

0.000011 

0.124709 

0.506265 

405 

0.006591 

0.002605 

0.003945 

605 

0.000008 

0.098280 

0.468240 

410 

0.009383 

0.003766 

0.005533 

610 

0.000005 

0.075870 

0.427109 

415 

0.013095 

0.005416 

0.007643 

615 

0.000004 

0.057719 

0.383763 

420 

0.017080 

0.007449 

0.010050 

620 

0.000003 

0.043249 

0.337736 

425 

0.020592 

0.009746 

0.012488 

625 

0.000003 

0.031762 

0.289181 

430 

0.023358 

0.012406 

0.014893 

630 

0.000002 

0.022909 

0.242081 

435 

0.025198 

0.015492 

0.017205 

635 

0.000001 

0.016373 

0.200298 

440 

0.025831 

0.018718 

0.019180 

640 

0.000000 

0.011623 

0.163370 

445 

0.025129 

0.021815 

0.020640 

645 

0.000000 

0.008138 

0.130186 

450 

0.023665 

0.024936 

0.021862 

650 

0.000000 

0.005644 

0.101351 

455 

0.022095 

0.028517 

0.023380 

655 

0.000000 

0.003927 

0.077573 

460 

0.020711 

0.033695 

0.026303 

660 

0.000000 

0.002750 

0.058247 

465 

0.019540 

0.041407 

0.031702 

665 

0.000000 

0.001904 

0.042625 

470 

0.017902 

0.051084 

0.039913 

670 

0.000000 

0.001308 

0.030691 

475 

0.015234 

0.061967 

0.051034 

675 

0.000000 

0.000911 

0.022336 

480 

0.012144 

0.074044 

0.064951 

680 

0.000000 

0.000645 

0.016354 

485 

0.009393 

0.087773 

0.081841 

685 

0.000000 

0.000450 

0.011553 

490 

0.007173 

0.104748 

0.103244 

690 

0.000000 

0.000302 

0.007898 

495 

0.005500 

0.126791 

0.131089 

695 

0.000000 

0.000192 

0.005509 

500 

0.004252 

0.155503 

0.167484 

700 

0.000000 

0.000120 

0.003980 

505 

0.003281 

0.191718 

0.213959 

705 

0.000000 

0.000085 

0.002859 

510 

0.002478 

0.233411 

0.269568 

710 

0.000000 

0.000075 

0.002025 

515 

0.001776 

0.277183 

0.331625 

715 

0.000000 

0.000075 

0.001438 

520 

0.001227 

0.316997 

0.392975 

720 

0.000000 

0.000068 

0.001032 

525 

0.000882 

0.347571 

0.446875 

725 

0.000000 

0.000040 

0.000740 

530 

0.000662 

0.369273 

0.492693 

730 

0.000000 

0.000000 

0.000504 

535 

0.000476 

0.383621 

0.531225 

735 

0.000000 

0.000000 

0.000504 

540 

0.000322 

0.391089 

0.562873 

740 

0.000000 

0.000000 

0.000504 

545 

0.000214 

0.392001 

0.588075 

745 

0.000000 

0.000000 

0.000504 

550 

0.000142 

0.387142 

0.607818 

750 

0.000000 

0.000000 

0.000504 

555 

0.000094 

0.377206 

0.622910 

755 

0.000000 

0.000000 

0.000504 

560 

0.000063 

0.362065 

0.632895 

760 

0.000000 

0.000000 

0.000504 

565 

0.000043 

0.341585 

0.637137 

765 

0.000000 

0.000000 

0.000504 

570 

0.000032 

0.316419 

0.635543 

770 

0.000000 

0.000000 

0.000504 

575 

0.000028 

0.287417 

0.628100 

775 

0.000000 

0.000000 

0.000504 


TABLE G.5

SML cone response curves. 
FIGURE G.4

The SML cone response curves. 


G.3 Light Sources

Table G.6 presents the reference curves for CIE standard illuminants A, B, and
C. Table G.7 provides data for a cool-white fluorescent bulb and CIE standard
illuminant D65. The values in these tables were interpolated from data collected
over [370, 730] nanometers in 1-nanometer increments in Brian Wandell’s laboratory
at Stanford University. Table G.8 gives the curves for the three CIE standard daylight
illuminants. The data for Tables G.6, G.7, and G.8 are plotted in Figures G.5, G.6,
and G.7.
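
The text does not say which interpolation scheme produced the 5-nanometer tables, so the following is only a minimal sketch that assumes plain linear interpolation. It also holds the first and last measured values outside the measured range, which is an assumption suggested by the tabulated values repeating beyond 730 nanometers (for example, A(λ) stays at 1.000000 from 730 to 775). The function name `resample` and the ramp-shaped test spectrum are hypothetical.

```python
from bisect import bisect_right


def resample(src_wl, src_val, dst_wl):
    """Linearly interpolate (src_wl, src_val) onto dst_wl.

    src_wl must be sorted in increasing order. Wavelengths outside the
    measured range receive the nearest end value (an assumption, not a
    documented property of the tables).
    """
    out = []
    for wl in dst_wl:
        if wl <= src_wl[0]:
            out.append(src_val[0])
        elif wl >= src_wl[-1]:
            out.append(src_val[-1])
        else:
            i = bisect_right(src_wl, wl)  # src_wl[i - 1] <= wl < src_wl[i]
            t = (wl - src_wl[i - 1]) / (src_wl[i] - src_wl[i - 1])
            out.append(src_val[i - 1] + t * (src_val[i] - src_val[i - 1]))
    return out


# Hypothetical 1-nm measurement: a straight ramp from 370 nm to 730 nm.
measured_wl = list(range(370, 731))
measured = [(wl - 370) / 360.0 for wl in measured_wl]

# The tables in this appendix run from 380 nm to 775 nm in 5-nm steps.
table_wl = list(range(380, 780, 5))
table = resample(measured_wl, measured, table_wl)
```

The same routine would apply to any of the 1-nanometer or 2-nanometer source data mentioned in this appendix, provided the source wavelengths are sorted.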
FIGURE G.5

CIE standard illuminants A, B, and C.



FIGURE G.6

A cool-white fluorescent light bulb and CIE standard illuminant D65. 
G REFERENCE DATA


λ

A(λ)

B(λ)

C(λ)

λ

A(λ)

B(λ)

C(λ)

380 

0.045345 

0.103646 

0.152693 

580 

0.529521 

0.467333 

0.452526 

385 

0.050435 

0.124237 

0.184712 

585 

0.546363 

0.463030 

0.441560 

390 

0.055941 

0.144827 

0.219323 

590 

0.563252 

0.459004 

0.431242 

395 

0.061771 

0.167407 

0.255275 

595 

0.580187 

0.455488 

0.422080 

400 

0.068064 

0.191098 

0.292893 

600 

0.597076 

0.453452 

0.415047 

405 

0.074727 

0.215713 

0.332269 

605 

0.614011 

0.453822 

0.411022 

410 

0.081806 

0.241070 

0.372941 

610 

0.630900 

0.455765 

0.409032 

415 

0.089256 

0.266981 

0.414261 

615 

0.647742 

0.458356 

0.408060 

420 

0.097122 

0.292430 

0.453914 

620 

0.664538 

0.461318 

0.407644 

425 

0.105451 

0.316352 

0.489543 

625 

0.681288 

0.464372 

0.407459 

430 

0.114150 

0.338238 

0.520081 

630 

0.697946 

0.467333 

0.407181 

435 

0.123265 

0.357718 

0.544836 

635 

0.714510 

0.469924 

0.406533 

440 

0.132797 

0.373866 

0.562188 

640 

0.730983 

0.472885 

0.406256 

445 

0.142745 

0.386082 

0.571210 

645 

0.747363 

0.476818 

0.407135 

450 

0.153109 

0.395151 

0.573755 

650 

0.763604 

0.480751 

0.408107 

455 

0.163844 

0.401999 

0.571904 

655 

0.779706 

0.483944 

0.408107 

460 

0.174949 

0.408569 

0.569591 

660 

0.795669 

0.485841 

0.406718 

465 

0.186470 

0.416805 

0.570516 

665 

0.811494 

0.486211 

0.403572 

470 

0.198362 

0.425689 

0.572830 

670 

0.827179 

0.485378 

0.399315 

475 

0.210624 

0.433787 

0.574172 

675 

0.842680 

0.483759 

0.394688 

480 

0.223209 

0.440496 

0.573293 

680 

0.857996 

0.480751 

0.388673 

485 

0.236165 

0.445262 

0.568758 

685 

0.873126 

0.475847 

0.380391 

490 

0.249445 

0.442347 

0.558486 

690 

0.888071 

0.470109 

0.371090 

495 

0.263048 

0.442856 

0.540903 

695 

0.902832 

0.464464 

0.362021 

500 

0.276976 

0.435869 

0.518693 

700 

0.917361 

0.458542 

0.353045 

505 

0.291181 

0.427401 

0.495003 

705 

0.931705 

0.452064 

0.344068 

510 

0.305664 

0.419674 

0.473348 

710 

0.945817 

0.445123 

0.334999 

515 

0.320424 

0.414816 

0.457200 

715 

0.959698 

0.437720 

0.325745 

520 

0.335462 

0.414122 

0.448362 

720 

0.973348 

0.429854 

0.316028 

525 

0.350685 

0.418425 

0.447807 

725 

0.986813 

0.421525 

0.306774 

530 

0.366139 

0.426615 

0.453452 

730 

1.000000 

0.413659 

0.297983 

535 

0.381825 

0.437072 

0.462428 

735 

1.000000 

0.413659 

0.297983 

540 

0.397696 

0.448362 

0.472423 

740 

1.000000 

0.413659 

0.297983 

545 

0.413705 

0.458819 

0.480983 

745 

1.000000 

0.413659 

0.297983 

550 

0.429900 

0.467333 

0.486767 

750 

1.000000 

0.413659 

0.297983 

555 

0.446234 

0.472885 

0.488941 

755 

1.000000 

0.413659 

0.297983 

560 

0.462706 

0.475662 

0.487229 

760 

1.000000 

0.413659 

0.297983 

565 

0.479271 

0.476217 

0.481723 

765 

1.000000 

0.413659 

0.297983 

570 

0.495928 

0.474736 

0.473348 

770 

1.000000 

0.413659 

0.297983 

575 

0.512678 

0.471497 

0.463400 

775 

1.000000 

0.413659 

0.297983 


TABLE G.6

CIE standard illuminants A, B, and C.
λ

Cool-white fluorescent

D65

λ

Cool-white fluorescent

D65

380 

0.000000 

50.000000 

580 

0.996610 

95.800003 

385 

0.000000 

52.299999 

585 

0.948428 

92.250000 

390 

0.000000 

54.599998 

590 

0.879458 

88.699997 

395 

0.000000 

68.699997 

595 

0.826315 

89.349998 

400 

0.243051 

82.800003 

600 

0.778983 

90.000000 

405 

0.000000 

87.150002 

605 

0.721405 

89.800003 

410 

0.246102 

91.500000 

610 

0.660000 

89.599998 

415 

0.097401 

92.449997 

615 

0.603640 

88.650002 

420 

0.096203 

93.400002 

620 

0.548542 

87.699997 

425 

0.368037 

90.050003 

625 

0.488339 

85.500000 

430 

0.637017 

86.699997 

630 

0.418983 

83.300003 

435 

0.626603 

95.800003 

635 

0.341488 

83.500000 

440 

0.459051 

104.900002 

640 

0.274780 

83.699997 

445 

0.320830 

110.949997 

645 

0.234955 

81.849998 

450 

0.256475 

117.000000 

650 

0.208881 

80.000000 

455 

0.263880 

117.400002 

655 

0.180710 

80.099998 

460 

0.296339 

117.800003 

660 

0.152949 

80.199997 

465 

0.308912 

116.349998 

665 

0.131028 

81.250000 

470 

0.308339 

114.900002 

670 

0.113695 

82.300003 

475 

0.310210 

115.400002 

675 

0.098593 

80.300003 

480 

0.313830 

115.900002 

680 

0.085627 

78.300003 

485 

0.315258 

112.349998 

685 

0.075082 

74.000000 

490 

0.313830 

108.800003 

690 

0.066508 

69.699997 

495 

0.309960 

109.099998 

695 

0.059265 

70.650002 

500 

0.305085 

109.400002 

700 

0.052678 

71.599998 

505 

0.301570 

108.599998 

705 

0.000000 

72.949997 

510 

0.304475 

107.800003 

710 

0.000000 

74.300003 

515 

0.315184 

106.300003 

715 

0.000000 

67.949997 

520 

0.317695 

104.800003 

720 

0.000000 

61.599998 

525 

0.309516 

106.250000 

725 

0.000000 

65.750000 

530 

0.359593 

107.699997 

730 

0.000000 

69.900002 

535 

0.518761 

106.050003 

735 

0.000000 

69.900002 

540 

0.693966 

104.400002 

740 

0.000000 

69.900002 

545 

0.784533 

104.199997 

745 

0.000000 

69.900002 

550 

0.803186 

104.000000 

750 

0.000000 

69.900002 

555 

0.792184 

102.000000 

755 

0.000000 

69.900002 

560 

0.798508 

100.000000 

760 

0.000000 

69.900002 

565 

0.854789 

98.150002 

765 

0.000000 

69.900002 

570 

0.931525 

96.300003 

770 

0.000000 

69.900002 

575 

0.987479 

96.050003 

775 

0.000000 

69.900002 


TABLE G.7

Spectral curves for a cool-white fluorescent light bulb and a CIE standard D65 illuminant. 


λ

Day 0(λ)

Day 1(λ)

Day 2(λ)

λ

Day 0(λ)

Day 1(λ)

Day 2(λ)

380 

63.4000 

38.5000 

3.0000 

580 

95.1000 

-3.5000 

0.5000 

385 

64.6000 

36.8000 

2.1000 

585 

92.1000 

-3.5000 

1.3000 

390 

65.8000 

35.0000 

1.2000 

590 

89.1000 

-3.5000 

2.1000 

395 

80.3000 

39.2000 

0.1000 

595 

89.8000 

-4.7000 

2.7000 

400 

94.8000 

43.4000 

-1.1000 

600 

90.5000 

-5.8000 

3.2000 

405 

99.8000 

44.9000 

-0.8000 

605 

90.4000 

-6.5000 

3.7000 

410 

104.8000 

46.3000 

-0.5000 

610 

90.3000 

-7.2000 

4.1000 

415 

105.4000 

45.1000 

-0.6000 

615 

89.4000 

-7.9000 

4.4000 

420 

105.9000 

43.9000 

-0.7000 

620 

88.4000 

-8.6000 

4.7000 

425 

101.4000 

40.5000 

-0.9000 

625 

86.2000 

-9.1000 

4.9000 

430 

96.8000 

37.1000 

-1.2000 

630 

84.0000 

-9.5000 

5.1000 

435 

105.4000 

36.9000 

-1.9000 

635 

84.6000 

-10.2000 

5.9000 

440 

113.9000 

36.7000 

-2.6000 

640 

85.1000 

-10.9000 

6.7000 

445 

119.8000 

36.3000 

-2.8000 

645 

83.5000 

-10.8000 

7.0000 

450 

125.6000 

35.9000 

-2.9000 

650 

81.9000 

-10.7000 

7.3000 

455 

125.6000 

34.3000 

-2.8000 

655 

82.3000 

-11.4000 

8.0000 

460 

125.5000 

32.6000 

-2.8000 

660 

82.6000 

-12.0000 

8.6000 

465 

123.4000 

30.3000 

-2.7000 

665 

83.8000 

-13.0000 

9.2000 

470 

121.3000 

27.9000 

-2.6000 

670 

84.9000 

-14.0000 

9.8000 

475 

121.3000 

26.1000 

-2.6000 

675 

83.1000 

-13.8000 

10.0000 

480 

121.3000 

24.3000 

-2.6000 

680 

81.3000 

-13.6000 

10.2000 

485 

117.4000 

22.2000 

-2.2000 

685 

76.6000 

-12.8000 

9.3000 

490 

113.5000 

20.1000 

-1.8000 

690 

71.9000 

-12.0000 

8.3000 

495 

113.3000 

18.2000 

-1.6000 

695 

73.1000 

-12.7000 

9.0000 

500 

113.1000 

16.2000 

-1.5000 

700 

74.3000 

-13.3000 

9.6000 

505 

112.0000 

14.7000 

-1.4000 

705 

75.4000 

-13.1000 

9.0000 

510 

110.8000 

13.2000 

-1.3000 

710 

76.4000 

-12.9000 

8.5000 

515 

108.7000 

10.9000 

-1.3000 

715 

69.9000 

-11.8000 

7.8000 

520 

106.5000 

8.6000 

-1.2000 

720 

63.3000 

-10.6000 

7.0000 

525 

107.7000 

7.4000 

-1.1000 

725 

67.5000 

-11.1000 

7.3000 

530 

108.8000 

6.1000 

-1.0000

730 

71.7000 

-11.6000 

7.6000 

535 

107.1000 

5.2000 

-0.8000 

735 

74.4000 

-11.9000 

7.8000 

540 

105.3000 

4.2000 

-0.5000 

740 

77.0000 

-12.2000 

8.0000 

545 

104.9000 

3.1000 

-0.4000 

745 

71.1000 

-11.2000 

7.4000 

550 

104.4000 

1.9000 

-0.3000 

750 

65.2000 

-10.2000 

6.7000 

555 

102.2000 

1.0000 

-0.2000 

755 

56.5000 

-9.0000 

6.0000 

560 

100.0000 

0.0000 

0.0000 

760 

47.7000 

-7.8000 

5.2000 

565 

98.0000 

-0.8000 

0.1000 

765 

58.2000 

-9.5000 

6.3000 

570 

96.0000 

-1.6000 

0.2000 

770 

68.6000 

-11.2000 

7.4000 

575 

95.6000 

-2.6000 

0.4000 

775 

66.8000 

-10.8000 

7.1000 


TABLE G.8

CIE daylight functions. 
FIGURE G.7

CIE daylight functions.
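
The following sketch shows one common use of the three daylight functions. It assumes that Day 0, Day 1, and Day 2 correspond to the functions S0, S1, and S2 of the CIE daylight model, which is consistent with the tabulated values at 560 nanometers (100, 0, and 0). In that model a daylight spectrum of chromaticity (x_D, y_D) is S0 + M1 S1 + M2 S2, with the weights M1 and M2 computed from the chromaticity. The function names are hypothetical, and only three rows of Table G.8 are used.

```python
def daylight_weights(x_d, y_d):
    """Weights M1, M2 of the CIE daylight model for chromaticity (x_d, y_d)."""
    m = 0.0241 + 0.2562 * x_d - 0.7341 * y_d
    m1 = (-1.3515 - 1.7703 * x_d + 5.9114 * y_d) / m
    m2 = (0.0300 - 31.4424 * x_d + 30.0717 * y_d) / m
    return m1, m2


def daylight_spd(day0, day1, day2, x_d, y_d):
    """Combine the three daylight functions sample by sample."""
    m1, m2 = daylight_weights(x_d, y_d)
    return [s0 + m1 * s1 + m2 * s2 for s0, s1, s2 in zip(day0, day1, day2)]


# Rows for 380, 450, and 560 nm, copied from Table G.8.
day0 = [63.4, 125.6, 100.0]
day1 = [38.5, 35.9, 0.0]
day2 = [3.0, -2.9, 0.0]

# Approximate chromaticity of D65 (an assumed input, not given in the table).
d65 = daylight_spd(day0, day1, day2, 0.3127, 0.3290)
```

With these inputs the three results should land close to the D65 column of Table G.7 at the same wavelengths (50.0, 117.0, and 100.0); small differences are expected because the chromaticity is rounded.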


G.4 Phosphors

Table G.9 presents phosphor intensities measured from a Hitachi monitor used in
a Silicon Graphics workstation. The data are in milliwatts per steradian per square
meter. These values were interpolated from data collected over [390, 730] nanometers
in 2-nanometer increments by Gary Meyer at the University of Oregon. They are
plotted in Figure G.8.







λ

R(λ)

G(λ)

B(λ)

λ

R(λ)

G(λ)

B(λ)

380 

0.057931 

0.021883 

-0.048760 

580 

0.028820 

0.211200 

0.004303 

385 

0.005826 

0.003728 

0.007903 

585 

0.100981 

0.176914 

0.004480 

390 

0.001204 

0.001455 

0.021440 

590 

0.102800 

0.150600 

0.003878 

395 

0.000540 

0.001771 

0.030972 

595 

0.225702 

0.129410 

0.004720 

400 

0.001028 

0.001476 

0.046450 

600 

0.051410 

0.108400 

0.002474 

405 

0.000649 

0.001882 

0.067898 

605 

0.039469 

0.090768 

0.002581 

410 

0.000865 

0.003481 

0.102700 

610 

0.060840 

0.074860 

0.002130 

415 

0.001967 

0.004548 

0.151278 

615 

0.330900 

0.065217 

0.004349 

420 

0.003545 

0.006293 

0.212700 

620 

0.325500 

0.053390 

0.003388 

425 

0.005347 

0.007975 

0.282917 

625 

1.149734 

0.055354 

0.010921 

430 

0.005832 

0.010470 

0.354000 

630 

0.645900 

0.041820 

0.006136 

435 

0.006674 

0.013214 

0.418059 

635 

0.097025 

0.026902 

0.002018 

440 

0.007489 

0.014730 

0.459400 

640 

0.030390 

0.021230 

0.001263 

445 

0.008482 

0.017184 

0.482813 

645 

0.022732 

0.019673 

0.001652 

450 

0.007565 

0.020130 

0.502500 

650 

0.016370 

0.011780 

0.002494 

455 

0.007673 

0.022908 

0.501154 

655 

0.019276 

0.008516 

0.000842 

460 

0.007446 

0.027660 

0.464300 

660 

0.015280 

0.006605 

0.000483 

465 

0.010566 

0.033830 

0.421582 

665 

0.009565 

0.004455 

0.000729 

470 

0.016810 

0.044000 

0.380800 

670 

0.013110 

0.001612 

0.000722 

475 

0.011341 

0.059700 

0.330554 

675 

0.012118 

0.001597 

0.000532 

480 

0.006619 

0.081580 

0.277700 

680 

0.010250 

0.001264 

0.000576 

485 

0.007587 

0.111515 

0.229958 

685 

0.046663 

0.001264 

0.002560 

490 

0.012500 

0.149000 

0.188900 

690 

0.023140 

0.001264 

0.001289 

495 

0.019539 

0.194846 

0.152174 

695 

0.038423 

0.001264 

0.001632 

500 

0.011220 

0.247200 

0.123300 

700 

0.038330 

0.001264 

0.001866 

505 

0.005122 

0.299112 

0.098008 

705 

0.750223 

0.001264 

0.008367 

510 

0.014670 

0.348600 

0.076800 

710 

0.204300 

0.001264 

0.002788 

515 

0.027285 

0.386872 

0.060078 

715 

0.023084 

0.001264 

0.001704 

520 

0.007949 

0.422200 

0.047210 

720 

0.011860 

0.001264 

0.001128 

525 

0.007450 

0.451545 

0.037035 

725 

0.017744 

0.001264 

0.003022 

530 

0.009993 

0.470300 

0.030450 

730 

0.012540 

0.001264 

0.002210 

535 

0.023196 

0.468729 

0.024329 

735 

0.012540 

0.001264 

0.002210 

540 

0.080380 

0.456200 

0.020490 

740 

0.012540 

0.001264 

0.002210 

545 

0.021056 

0.441443 

0.016788 

745 

0.012540 

0.001264 

0.002210 

550 

0.012280 

0.427200 

0.013200 

750 

0.012540 

0.001264 

0.002210 

555 

0.035867 

0.406477 

0.011057 

755 

0.012540 

0.001264 

0.002210 

560 

0.012460 

0.373000 

0.008625 

760 

0.012540 

0.001264 

0.002210 

565 

0.016671 

0.334325 

0.006725 

765 

0.012540 

0.001264 

0.002210 

570 

0.013910 

0.288200 

0.006555 

770 

0.012540 

0.001264 

0.002210 

575 

0.011911 

0.246348 

0.004840 

775 

0.012540 

0.001264 

0.002210 


TABLE G.9

Phosphor intensities measured off of a Hitachi monitor used in a Silicon Graphics workstation. 
Data represents milliwatts per steradian per square meter. 
FIGURE G.8

Hitachi monitor phosphor.


G.5 Macbeth ColorChecker

The Macbeth ColorChecker is a piece of white board on which has been
printed a 4 x 6 grid of squares of different colors [291]. It is a commercial
product manufactured by Macbeth, a division of the Kollmorgen Corporation in
New York. The ColorChecker chart is widely available in art and photo
supply stores.

The colors were chosen to represent a variety of naturally occurring
colors, and they provide a convenient and easily accessible standard set of colors. The
reasoning and design behind the chart are given in [291].

Table G.10 gives the chromaticity coordinates for the Macbeth ColorChecker 
squares. 
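
The chromaticity coordinates (x, y) and luminance Y in Table G.10 can be turned back into XYZ tristimulus values with the standard relations X = xY/y and Z = (1 - x - y)Y/y. The sketch below applies them to the White square; the function name `xyY_to_XYZ` is hypothetical.

```python
def xyY_to_XYZ(x, y, Y):
    """Convert CIE chromaticity (x, y) plus luminance Y to XYZ."""
    if y == 0:
        # Chromaticity is undefined when y is zero; return black.
        return (0.0, 0.0, 0.0)
    X = x * Y / y
    Z = (1.0 - x - y) * Y / y
    return (X, Y, Z)


# The White square from Table G.10.
X, Y, Z = xyY_to_XYZ(0.3101, 0.3163, 90.01)
```

Dividing X and Y by X + Y + Z recovers the original x and y, which gives a quick consistency check on any row of the table.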

The Macbeth spectra were measured in Brian Wandell’s laboratory at Stanford
University over the range [380, 720] nanometers in 1-nanometer steps. The 5-nanometer
data are presented in Tables G.11 through G.16 and plotted in Figures G.9
through G.15.
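
As a rough illustration of how the tabulated spectra might be used, the sketch below computes a luminance factor Y from a reflectance spectrum, an illuminant, and the CIE ȳ function, all sampled at the same wavelengths. It assumes that the Macbeth values are reflectances in percent (an assumption; the text does not state the units) and that the second data column of Table G.4 is ȳ. The function name is hypothetical, and the three sample rows are far too few for an accurate result.

```python
def luminance_factor(reflectance_pct, illuminant, ybar):
    """Y (0 to 100) of a surface under an illuminant, by a discrete sum.

    All three sequences must be sampled at the same, evenly spaced
    wavelengths. Reflectance is taken to be in percent (an assumption).
    """
    weights = [e * yb for e, yb in zip(illuminant, ybar)]
    total = sum(weights)
    if total == 0:
        raise ValueError("illuminant carries no luminance")
    reflected = sum(r / 100.0 * w for r, w in zip(reflectance_pct, weights))
    return 100.0 * reflected / total


# Rows for 500, 555, and 600 nm copied from the tables in this appendix.
ybar = [0.3230, 1.0002, 0.6310]               # Table G.4
illum_c = [0.518693, 0.488941, 0.415047]      # Table G.6, illuminant C
dark_skin = [4.884938, 7.463114, 13.344550]   # Table G.11

y_dark_skin = luminance_factor(dark_skin, illum_c, ybar)
y_perfect = luminance_factor([100.0, 100.0, 100.0], illum_c, ybar)
```

A perfect reflector returns 100 by construction. Using all 5-nanometer rows rather than three would be needed before comparing the result against the Y column of Table G.10.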

Name             x        y        Y

Dark skin        0.4002   0.3504   10.05
Light skin       0.3773   0.3446   35.82
Blue sky         0.2470   0.2514   19.33
Foliage          0.3372   0.4220   13.29
Blue flower      0.2651   0.2400   24.27
Bluish green     0.2608   0.3430   43.06
Orange           0.5060   0.4070   30.05
Purplish blue    0.2110   0.1750   12.00
Moderate red     0.4533   0.3058   19.77
Purple           0.2845   0.2020    6.56
Yellow green     0.3800   0.4887   44.29
Orange yellow    0.4729   0.4375   43.06
Blue             0.1866   0.1285    6.11
Green            0.3046   0.4782   23.39
Red              0.5385   0.3129   12.00
Yellow           0.4480   0.4703   59.10
Magenta          0.3635   0.2325   19.77
Cyan             0.1958   0.2519   19.77
White            0.3101   0.3163   90.01
Neutral 8        0.3101   0.3163   59.10
Neutral 6.5      0.3101   0.3163   36.20
Neutral 5        0.3101   0.3163   19.77
Neutral 3.5      0.3101   0.3163    9.00
Black            0.3101   0.3163    3.13


TABLE G.10

1931 CIE chromaticity coordinates for the Macbeth ColorChecker. 
FIGURE G.9

Spectral plots for Macbeth colors light skin and dark skin.



FIGURE G.10

Spectral plots for Macbeth colors blue sky, foliage, and blue flower. 
FIGURE G.11

Spectral plots for Macbeth colors purple, purplish blue, and bluish green. 



FIGURE G.12

Spectral plots for Macbeth colors orange, orange-yellow, moderate red, and yellow-green. 





FIGURE G.13


Spectral plots for Macbeth colors yellow, cyan, and magenta. 



FIGURE G.14

Spectral plots for Macbeth colors red, green, and blue. 



FIGURE G.15

Spectral plots for Macbeth colors black, neutral 3.5, neutral 5, neutral 6.5, neutral 8, and white.






λ

Dark skin

Light skin

Blue sky

Foliage

λ

Dark skin

Light skin

Blue sky

Foliage

380 

3.885702 

16.089849 

17.943291 

2.546426 

580 

11.402650 

46.825020 

15.762640 

9.431731 

385 

4.450390 

17.960190 

22.671278 

2.610934 

585 

12.016553 

50.580948 

15.465015 

8.970447 

390 

4.833745 

19.966330 

27.405220 

2.542109 

590 

12.500280 

53.334690 

15.172400 

8.491978 

395 

5.165829 

21.059692 

31.227926 

2.640033 

595 

12.906229 

55.485752 

14.887200 

8.111174 

400 

5.140137 

21.424000 

33.491619 

2.668386 

600 

13.344550 

57.093788 

14.590760 

7.881492 

405 

5.004951 

21.630323 

34.459782 

2.670110 

605 

13.780877 

58.501759 

14.266936 

7.831820 

410 

4.745532 

21.410021 

34.664639 

2.679658 

610 

14.215940 

59.758968 

13.900530 

7.900640 

415 

4.505754 

21.685217 

34.934277 

2.752025 

615 

14.655576 

60.706512 

13.487477 

8.019826 

420 

4.267467 

21.972380 

35.077499 

2.810802 

620 

15.154200 

61.857552 

13.114290 

8.062486 

425 

4.049342 

22.389822 

35.222507 

2.941530 

625 

15.673985 

62.866501 

12.761962 

8.006430 

430 

3.852713 

22.755671 

34.938000 

3.015812 

630 

16.227119 

63.668449 

12.378260 

7.889882 

435 

3.654791 

23.196367 

34.497162 

3.144345 

635 

16.870535 

64.770386 

12.054096 

7.807934 

440 

3.559050 

23.811920 

34.155392 

3.318482 

640 

17.551279 

65.642593 

11.751030 

7.739443 

445 

3.557104 

24.944149 

34.030743 

3.446844 

645 

18.364779 

66.633049 

11.469259 

7.723470 

450 

3.543214 

26.212271 

33.742401 

3.607446 

650 

19.215750 

67.716698 

11.184400 

7.795821 

455 

3.553052 

27.517824 

33.172203 

3.751262 

655 

20.125881 

68.979088 

10.994964 

7.977539 

460 

3.582899 

28.846600 

32.299450 

3.861416 

660 

21.052549 

70.166290 

10.798680 

8.310620 

465 

3.641600 

30.132538 

31.374500 

3.987481 

665 

21.970407 

71.502991 

10.651037 

8.904679 

470 

3.750086 

31.160351 

30.467739 

4.128319 

670 

22.988800 

73.308052 

10.570910 

9.847201 

475 

3.924307 

32.032017 

29.688854 

4.297079 

675 

23.967045 

75.023354 

10.500649 

11.135474 

480 

4.143078 

32.780708 

28.912260 

4.532319 

680 

24.994640 

76.834427 

10.429630 

12.728550 

485 

4.340704 

33.241966 

28.007822 

4.811964 

685 

26.110291 

78.500015 

10.411583 

14.420553 

490 

4.516574 

33.585011 

27.065760 

5.311668 

690 

27.267071 

80.319321 

10.388900 

15.916030 

495 

4.697203 

34.209358 

26.075449 

6.278073 

695 

28.366152 

81.633400 

10.317711 

17.029913 

500 

4.884938 

34.834000 

25.025570 

7.983501 

700 

29.473770 

82.811432 

10.261740 

17.841150 

505 

5.145301 

35.022625 

24.017603 

10.442900 

705 

30.904243 

84.173866 

10.246243 

18.574108 

510 

5.452155 

34.222618 

23.100170 

13.258410 

710 

32.329899 

84.750160 

10.232450 

19.173149 

515 

5.723958 

32.859192 

22.320612 

15.591791 

715 

34.044037 

85.906219 

10.215766 

19.946556 

520 

5.934415 

31.460819 

21.590370 

16.911671 

720 

35.786400 

86.237663 

10.266010 

20.861300 

525 

6.041916 

30.508289 

20.959642 

17.141542 

725 

35.786400 

86.237663 

10.266010 

20.861300 

530 

6.115690 

30.355829 

20.435789 

16.550421 

730 

35.786400 

86.237663 

10.266010 

20.861300 

535 

6.196870 

30.720617 

19.948175 

15.435068 

735 

35.786400 

86.237663 

10.266010 

20.861300 

540 

6.343112 

31.166571 

19.510401 

14.199900 

740 

35.786400 

86.237663 

10.266010 

20.861300 

545 

6.560757 

31.045612 

19.001617 

12.963397 

745 

35.786400 

86.237663 

10.266010 

20.861300 

550 

6.932740 

30.636030 

18.471201 

11.802250 

750 

35.786400 

86.237663 

10.266010 

20.861300 

555 

7.463114 

30.617739 

17.922029 

10.926905 

755 

35.786400 

86.237663 

10.266010 

20.861300 

560 

8.179690 

31.725121 

17.381670 

10.582230 

760 

35.786400 

86.237663 

10.266010 

20.861300 

565 

8.991974 

34.118702 

16.938551 

10.437484 

765 

35.786400 

86.237663 

10.266010 

20.861300 

570 

9.840432 

37.883961 

16.540689 

10.182910 

770 

35.786400 

86.237663 

10.266010 

20.861300 

575 

10.683280 

42.369698 

16.172670 

9.858121 

775 

35.786400 

86.237663 

10.266010 

20.861300 


TABLE G.11

Macbeth chart spectra for dark skin, light skin, blue sky, and foliage. 





λ

Blue flower

Bluish green

Orange

Purplish blue

λ

Blue flower

Bluish green

Orange

Purplish blue

380 

21.504471 

18.686119 

6.143748 

15.799930 

580 

23.182091 

30.971640 

51.863571 

8.056614 

385 

26.985552 

23.105488 

5.192119 

19.774714 

585 

23.787580 

28.732615 

53.595318 

8.098731 

390 

34.021549 

27.136400 

4.867970 

24.081640 

590 

24.193159 

26.890051 

54.809231 

8.031386 

395 

39.510910 

30.388052 

5.092529 

27.756655 

595 

24.678110 

25.456657 

55.630657 

7.925573 

400 

43.429729 

32.360771 

4.717562 

30.455971 

600 

25.161699 

24.410139 

56.426868 

7.856916 

405 

44.911545 

33.244198 

4.663087 

32.119804 

605 

25.263716 

23.628857 

57.159523 

7.819657 

410 

45.581810 

33.674278 

4.455331 

32.956371 

610 

24.993191 

23.123470 

57.816502 

7.861434 

415 

45.900398 

34.219761 

4.562958 

34.028576 

615 

24.918528 

22.733202 

58.437481 

7.986660 

420 

46.183998 

35.044319 

4.517197 

35.343910 

620 

25.269529 

22.528919 

59.110668 

8.230119 

425 

46.186264 

36.028893 

4.536289 

36.495560 

625 

26.073601 

22.263077 

59.586777 

8.618429 

430 

45.726070 

36.568611 

4.454180 

37.343262 

630 

27.443090 

21.966181 

59.988918 

9.067121 

435 

45.079132 

37.241230 

4.543101 

38.061672 

635 

29.665966 

21.737934 

60.345753 

9.595715 

440 

44.509541 

38.326118 

4.491708 

38.233730 

640 

32.639080 

21.489401 

60.528080 

10.130000 

445 

44.257874 

40.065491 

4.528984 

38.131840 

645 

36.210133 

21.407528 

60.780136 

10.664613 

450 

43.903500 

42.215569 

4.598219 

37.376381 

650 

40.103359 

21.544180 

60.880470 

11.057190 

455 

43.277981 

44.890938 

4.663068 

36.047756 

655 

43.931278 

21.870035 

61.062321 

11.314054 

460 

42.230179 

47.560928 

4.753312 

34.203388 

660 

47.289768 

22.310350 

61.085751 

11.284740 

465 

40.963223 

50.571754 

4.876582 

32.064915 

665 

49.895874 

22.888117 

61.250366 

11.057791 

470 

39.665791 

53.662781 

4.960935 

29.636610 

670 

52.047100 

23.528099 

61.662369 

10.776800 

475 

38.928234 

56.587734 

5.130267 

27.056273 

675 

53.572105 

24.078556 

61.931293 

10.465913 

480 

38.122768 

58.940788 

5.313134 

24.475210 

680 

54.650139 

24.552179 

62.382530 

10.244980 

485 

37.096794 

60.270790 

5.427264 

22.069368 

685 

55.462223 

25.144257 

62.865440 

10.170987 

490 

35.917110 

60.675900 

5.687261 

19.964800 

690 

56.214321 

25.597771 

63.395748 

10.256740 

495 

34.488316 

60.404842 

6.020061 

18.108223 

695 

56.626854 

25.747564 

63.531815 

10.416534 

500 

32.753262 

59.725731 

6.536922 

16.407801 

700 

57.083649 

25.676689 

63.722641 

10.619700 

505 

30.611862 

59.037102 

7.313779 

14.902749 

705 

57.652874 

25.431557 

64.181450 

11.001964 

510 

28.324471 

58.389900 

8.567378 

13.650830 

710 

57.917110 

24.948210 

64.155388 

11.347610 

515 

26.148941 

57.590084 

10.213456 

12.728430 

715 

58.315697 

24.583284 

64.506691 

11.803800 

520 

24.276060 

56.700130 

12.347440 

12.053160 

720 

58.571461 

24.529430 

64.600601 

12.386420 

525 

22.837721 

55.223652 

14.861832 

11.448863 

725 

58.571461 

24.529430 

64.600601 

12.386420 

530 

21.911461 

53.463741 

17.522091 

10.954790 

730 

58.571461 

24.529430 

64.600601 

12.386420 

535 

21.528542 

51.143139 

20.319046 

10.455321 

735 

58.571461 

24.529430 

64.600601 

12.386420 

540 

21.561150 

48.929329 

23.213921 

9.924716 

740 

58.571461 

24.529430 

64.600601 

12.386420 

545 

21.424437 

46.547081 

26.351696 

9.342658 

745 

58.571461 

24.529430 

64.600601 

12.386420 

550 

20.754511 

44.271980 

30.008520 

8.761171 

750 

58.571461 

24.529430 

64.600601 

12.386420 

555 

20.276314 

42.037167 

34.099560 

8.321023 

755 

58.571461 

24.529430 

64.600601 

12.386420 

560 

20.258760 

39.957352 

38.503441 

8.052579 

760 

58.571461 

24.529430 

64.600601 

12.386420 

565 

20.681358 

37.846802 

42.740505 

7.949393 

765 

58.571461 

24.529430 

64.600601 

12.386420 

570 

21.446180 

35.616039 

46.404800 

7.928015 

770 

58.571461 

24.529430 

64.600601 

12.386420 

575 

22.377563 

33.342571 

49.572742 

7.994457 

775 

58.571461 

24.529430 

64.600601 

12.386420 


TABLE G.12

Macbeth chart spectra for blue flower, bluish green, orange, and purplish blue. 



λ

Moderate red

Purple

Yellow-green

Orange-yellow

λ

Moderate red

Purple

Yellow-green

Orange-yellow

380 

12.669950 

11.859360 

5.699953 

6.128503 

580 

28.888000 

4.242651 

41.317768 

60.767220 

385 

14.544419 

14.397915 

5.295924 

5.963221 

585 

35.891396 

4.410142 

39.298252 

61.392555 

390 

15.348420 

16.806971 

5.190561 

5.502130 

590 

42.222599 

4.726547 

37.397831 

61.844471 

395 

15.663042 

18.538034 

5.193296 

5.262633 

595 

47.389523 

5.251801 

35.887939 

62.124138 

400 

15.440190 

19.507759 

5.253389 

5.069424 

600 

51.272831 

6.055812 

34.859692 

62.476151 

405 

14.938562 

19.613047 

5.187729 

5.262915 

605 

54.253593 

7.160420 

34.170391 

62.988613 

410 

14.476950 

19.221670 

5.058478 

5.094549 

610 

56.294781 

8.484931 

33.650341 

63.419258 

415 

14.282184 

18.745989 

5.258210 

5.165659 

615 

57.641983 

9.963583 

33.258446 

63.862713 

420 

14.062970 

17.924740 

5.420898 

5.269699 

620 

58.599480 

11.448830 

33.054859 

64.243538 

425 

14.033772 

16.916019 

5.601523 

5.297654 

625 

59.279758 

12.870117 

32.866116 

64.701447 

430 

13.942360 

15.635920 

5.927870 

5.274950 

630 

59.545589 

14.237090 

32.563110 

64.920761 

435 

13.707626 

14.395824 

6.187038 

5.308822 

635 

59.743793 

15.569023 

32.343559 

65.222870 

440 

13.486230 

13.231260 

6.503921 

5.357237 

640 

59.785160 

16.828690 

32.089470 

65.231102 

445 

13.459547 

12.166025 

7.160728 

5.576923 

645 

59.796532 

18.113380 

32.033154 

65.480751 

450 

13.427400 

11.141210 

7.912244 

5.758418 

650 

59.745998 

19.487631 

32.239601 

65.551781 

455 

13.396607 

10.160862 

8.914824 

6.232115 

655 

59.640450 

20.996145 

32.673386 

65.568802 

460 

13.266400 

9.231147 

10.078100 

6.698434 

660 

59.723228 

22.531851 

33.203732 

65.598961 

465 

12.960032 

8.429399 

11.531056 

7.507004 

665 

59.561363 

24.294062 

33.880280 

65.700867 

470 

12.515480 

7.772344 

13.340230 

8.380088 

670 

59.904331 

26.318970 

34.743710 

66.116653 

475 

11.967863 

7.196618 

15.777214 

9.273271 

675 

60.057800 

28.436649 

35.445564 

66.352913 

480 

11.482440 

6.696550 

18.759899 

10.123820 

680 

60.308899 

30.740080 

36.263580 

66.685081 

485 

11.087990 

6.231675 

22.517645 

10.892391 

685 

60.629135 

33.170681 

36.929043 

67.072021 

490 

10.811620 

5.775795 

27.220400 

11.648160 

690 

60.795078 

35.678871 

37.536949 

67.572678 

495 

10.615905 

5.434763 

32.417606 

12.608733 

695 

60.868801 

38.061874 

37.830456 

67.729622 

500 

10.328200 

5.174724 

37.850819 

13.805800 

700 

60.864300 

40.371368 

37.764172 

67.714233 

505

10.051079 

4.984176 

42.984535 

15.570681 

705 

61.017906 

42.764141 

37.608898 

68.124062 

510 

9.664922 

4.856252 

47.265862 

18.120770 

710 

60.996571 

44.761829 

37.083248 

68.049828 

515 

9.408811 

4.718699 

50.485565 

21.505651 

715 

61.058018 

46.829060 

36.871063 

68.218697 

520 

9.321096 

4.556475 

52.618839 

25.682369 

720 

60.921082 

48.540401 

36.803909 

68.387672 

525 

9.370760 

4.379337 

53.592617 

30.245390 

725 

60.921082 

48.540401 

36.803909 

68.387672 

530 

9.588202 

4.263306 

53.714550 

34.985229 

730 

60.921082 

48.540401 

36.803909 

68.387672 

535 

9.958136 

4.217373 

53.179836 

39.500118 

735 

60.921082 

48.540401 

36.803909 

68.387672 

540 

10.435310 

4.230328 

52.391361 

43.667969 

740 

60.921082 

48.540401 

36.803909 

68.387672 

545 

10.874439 

4.287500 

51.413731 

47.134743 

745 

60.921082 

48.540401 

36.803909 

68.387672 

550 

11.133060 

4.372940 

50.334900 

50.152199 

750 

60.921082 

48.540401 

36.803909 

68.387672 

555 

11.316847 

4.395883 

49.233387 

52.729702 

755 

60.921082 

48.540401 

36.803909 

68.387672 

560 

11.933180 

4.375529 

48.121429 

55.035461 

760 

60.921082 

48.540401 

36.803909 

68.387672 

565 

13.563336 

4.314129 

46.781601 

57.034595 

765 

60.921082 

48.540401 

36.803909 

68.387672 

570 

16.881001 

4.251691 

45.183651 

58.611809 

770 

60.921082 

48.540401 

36.803909 

68.387672 

575 

22.200445 

4.206913 

43.403061 

59.931717 

775 

60.921082 

48.540401 

36.803909 

68.387672 


TABLE G.13

Macbeth chart spectra for moderate red, purple, yellow-green, and orange-yellow. 






G REFERENCE DATA 


λ

Blue 

Green 

Red 

Yellow 

λ

Blue 

Green 

Red 

Yellow 

380 

10.439910 

5.613400 

5.607345 

5.414068 

580 

3.787635 

16.728580 

11.225140 

74.554909 

385 

12.886761 

5.922082 

5.725149 

5.137067 

585 

3.776771 

14.735263 

15.120548 

75.039360 

390 

15.300090 

6.200247 

5.218394 

4.704292 

590 

3.750088 

12.870290 

20.188690 

75.159851 

395 

17.832754 

6.012131 

4.956743 

4.520826 

595 

3.722865 

11.336102 

26.293297 

75.268250 

400 

19.873070 

6.146746 

5.121036 

4.344138 

600 

3.744416 

10.124970 

33.403919 

75.418297 

405 

21.445684 

6.098392 

4.849411 

4.163425 

605 

3.774995 

9.289590 

41.325912 

75.863037 

410 

22.557791 

6.206563 

4.834329 

4.315780 

610 

3.831947 

8.658465 

48.992439 

76.238503 

415 

24.002087 

6.295394 

4.646286 

4.125665 

615 

3.899615 

8.251087 

55.669144 

76.591118 

420 

25.866680 

6.484003 

4.741153 

4.300586 

620 

3.970753 

7.988280 

60.940609 

77.154846 

425 

27.594967 

6.663407 

4.744888 

4.335837 

625 

4.070755 

7.791666 

64.768387 

77.618408 

430 

29.415440 

6.859391 

4.712543 

4.441292 

630 

4.181859 

7.609302 

67.367477 

77.834442 

435 

31.260592 

7.121455 

4.758327 

4.448205 

635 

4.297741 

7.450396 

69.261131 

78.160141 

440 

32.378559 

7.413484 

4.605003 

4.512076 

640 

4.410157 

7.311813 

70.503029 

78.248161 

445 

32.824814 

7.875336 

4.654145 

4.661452 

645 

4.532931 

7.222866 

71.407242 

78.407303 

450 

32.212132 

8.412111 

4.639721 

4.968147 

650 

4.670670 

7.161132 

71.985252 

78.438698 

455 

30.671518 

9.119376 

4.563574 

5.431167 

655 

4.774706 

7.167995 

72.515152 

78.558060 

460 

28.334240 

9.851811 

4.515702 

6.415275 

660 

4.850951 

7.324787 

72.721283 

78.497429 

465 

25.484060 

10.781063 

4.380999 

7.625481 

665 

4.924014 

7.446044 

73.152130 

78.523407 

470 

22.396629 

11.853670 

4.277033 

9.682865 

670 

4.990421 

7.688989 

73.842392 

78.883820 

475 

19.292439 

13.102253 

4.211078 

12.573673 

675 

5.052288 

7.914150 

74.320816 

79.112015 

480 

16.411600 

14.572390 

4.135724 

16.342621 

680 

5.141122 

8.186454 

74.898903 

79.429321 

485 

13.933167 

16.115742 

4.103681 

21.097710 

685 

5.236134 

8.430433 

75.451370 

79.794380 

490 

11.879180 

17.952681 

4.063927 

26.350420 

690 

5.436373 

8.752160 

76.021973 

80.120506 

495 

10.174302 

20.263348 

3.998882 

31.775793 

695 

5.612031 

8.910556 

76.308281 

80.328354 

500 

8.797017 

23.403231 

3.906034 

37.215130 

700 

5.848758 

8.969923 

76.621246 

80.466164 

505 

7.712556 

27.286863 

3.941102 

42.412758 

705 

6.181974 

9.050636 

77.229218 

81.044724 

510 

6.886145 

31.145849 

3.924661 

47.316730 

710 

6.504695 

8.918600 

77.277542 

80.947998 

515 

6.304307 

34.160007 

3.978011 

51.712082 

715 

6.921960 

8.914872 

77.749832 

81.159058 

520 

5.817560 

35.976002 

4.021000 

55.590401 

720 

7.462452 

8.840295 

77.937653 

81.175743 

525 

5.407475 

36.454826 

4.037704 

58.489658 

725 

7.462452 

8.840295 

77.937653 

81.175743 

530 

5.059405 

35.995239 

4.088925 

60.763920 

730 

7.462452 

8.840295 

77.937653 

81.175743 

535 

4.751938 

34.788082 

4.060666 

62.378773 

735 

7.462452 

8.840295 

77.937653 

81.175743 

540 

4.495099 

33.215500 

4.200492 

63.929630 

740 

7.462452 

8.840295 

77.937653 

81.175743 

545 

4.260099 

31.219875 

4.266126 

65.257240 

745 

7.462452 

8.840295 

77.937653 

81.175743 

550 

4.095733 

29.161699 

4.498791 

66.725067 

750 

7.462452 

8.840295 

77.937653 

81.175743 

555 

3.963588 

27.074284 

4.790018 

68.364166 

755 

7.462452 

8.840295 

77.937653 

81.175743 

560 

3.891369 

25.007160 

5.160057 

70.174118 

760 

7.462452 

8.840295 

77.937653 

81.175743 

565 

3.841832 

22.942715 

5.749929 

71.759537 

765 

7.462452 

8.840295 

77.937653 

81.175743 

570 

3.820277 

20.921579 

6.728241 

73.110298 

770 

7.462452 

8.840295 

77.937653 

81.175743 

575 

3.801960 

18.823641 

8.466923 

74.078796 

775 

7.462452 

8.840295 

77.937653 

81.175743 


TABLE G.14

Macbeth chart spectra for blue, green, red, and yellow. 
λ

Magenta 

Cyan 

White 

Neutral .8 

λ

Magenta 

Cyan 

White 

Neutral .8 

380 

18.022249 

13.999790 

25.375759 

22.696239 

580 

21.613300 

8.781203 

92.538689 

60.987720 

385 

24.141811 

17.662146 

32.028969 

29.054020 

585 

26.161331 

8.340083 

92.293861 

60.901627 

390 

29.218670 

20.940290 

41.140018 

37.571621 

590 

30.903650 

7.908925 

91.910744 

60.653358 

395 

33.815952 

23.452644 

53.732193 

46.166943 

595 

35.840412 

7.572946 

91.650513 

60.300995 

400 

35.977070 

25.668360 

66.863197 

53.429451 

600 

41.036240 

7.351048 

91.441002 

60.132450 

405 

36.876205 

26.630613 

76.693954 

57.110985 

605 

46.447910 

7.223466 

91.463058 

60.118248 

410 

36.754471 

27.105560 

81.899612 

58.401501 

610 

51.659920 

7.129447 

91.623558 

60.215561 

415 

36.671719 

28.486927 

85.311897 

59.024353 

615 

56.339233 

7.113521 

91.761253 

60.238060 

420 

35.898029 

29.540880 

86.954269 

59.528542 

620 

60.391499 

7.105100 

92.056503 

60.303070 

425 

35.148922 

30.971483 

88.338806 

59.993679 

625 

63.846664 

7.103106 

92.346115 

60.339325 

430 

33.675171 

32.553398 

88.422600 

59.915970 

630 

66.495522 

7.099453 

92.366142 

60.116680 

435 

32.182053 

34.028538 

88.229996 

59.697506 

635 

68.510284 

7.152534 

92.175926 

60.058334 

440 

30.604540 

35.484280 

88.368492 

59.728500 

640 

70.003822 

7.252600 

92.107048 

59.853298 

445 

29.404226 

37.367111 

89.407570 

60.187801 

645 

71.087265 

7.349627 

91.950233 

59.600319 

450 

28.225990 

39.260792 

90.160072 

60.356602 

650 

71.911079 

7.430556 

91.762352 

59.369801 

455 

26.886612 

41.293606 

90.400452 

60.453487 

655 

72.552330 

7.536674 

91.552132 

59.176750 

460 

25.465940 

42.623539 

90.278893 

60.163570 

660 

72.939537 

7.535679 

91.366997 

58.879410 

465 

23.959021 

43.587276 

89.992531 

59.868294 

665 

73.311600 

7.537577 

91.155579 

58.789925 

470 

22.230909 

44.182098 

90.135948 

59.730091 

670 

74.038681 

7.492112 

91.727760 

58.988022 

475 

20.828238 

44.616940 

90.474327 

60.002926 

675 

74.580025 

7.411035 

91.787041 

58.964962 

480 

19.569361 

44.645649 

91.306183 

60.322170 

680 

74.964828 

7.313859 

92.107681 

59.136768 

485 

18.467690 

43.998219 

91.767899 

60.546734 

685 

75.547630 

7.236807 

92.397926 

59.311607 

490 

17.499559 

42.721741 

91.626801 

60.466930 

690 

76.021133 

7.118407 

92.761703 

59.392921 

495 

16.547125 

41.007633 

91.293587 

60.220291 

695 

76.412292 

7.020689 

92.583092 

59.202972 

500 

15.474920 

38.825981 

90.850433 

59.951031 

700 

76.642761 

6.981132 

92.438179 

59.085461 

505 

14.258726 

36.596691 

90.837151 

59.914757 

705 

77.305191 

7.056737 

92.927948 

59.169731 

510 

12.908480 

34.225380 

91.037247 

60.110161 

710 

77.348541 

7.200351 

92.690323 

58.877949 

515 

11.735996 

31.868864 

91.565384 

60.493999 

715 

77.644272 

7.454174 

92.630486 

58.874104 

520 

10.841580 

29.297951 

91.988602 

60.780670 

720 

77.633263 

7.847146 

92.432678 

58.492451 

525 

10.336895 

26.658068 

92.111366 

60.855427 

725 

77.633263 

7.847146 

92.432678 

58.492451 

530 

10.238720 

24.025311 

91.802696 

60.751438 

730 

77.633263 

7.847146 

92.432678 

58.492451 

535 

10.372057 

21.356544 

91.324654 

60.423901 

735 

77.633263 

7.847146 

92.432678 

58.492451 

540 

10.613430 

18.867420 

91.041283 

60.229000 

740 

77.633263 

7.847146 

92.432678 

58.492451 

545 

10.653950 

16.573082 

90.819458 

60.184757 

745 

77.633263 

7.847146 

92.432678 

58.492451 

550 

10.483800 

14.551340 

90.852364 

60.230389 

750 

77.633263 

7.847146 

92.432678 

58.492451 

555 

10.505778 

12.926720 

91.282684 

60.397083 

755 

77.633263 

7.847146 

92.432678 

58.492451 

560 

11.048930 

11.628690 

91.900436 

60.725960 

760 

77.633263 

7.847146 

92.432678 

58.492451 

565 

12.309056 

10.648726 

92.308823

60.882137 

765 

77.633263 

7.847146 

92.432678 

58.492451 

570 

14.462500 

9.884577 

92.568031 

61.023682 

770 

77.633263 

7.847146 

92.432678 

58.492451 

575 

17.684841 

9.296182 

92.796066 

61.169136 

775 

77.633263 

7.847146 

92.432678 

58.492451 


TABLE G.15

Macbeth chart spectra for magenta, cyan, white, and neutral .8. 




TABLE G.16

Macbeth chart spectra for neutral 6.5, neutral .5, neutral 3.5, and black. 
G.6 Real Objects

Tables G.17 through G.26 present spectral information for forty real objects. The
data in these tables were interpolated from measurements in the range [390, 730]
nanometers at 2-nanometer steps, by Vrhel, Iwan and Gershon [457]. Their work
was supported by NCSU and Eastman Kodak. As of publication, the original data,
plus data for the Munsell color chips and 130 other real objects, is available via
anonymous ftp from the server ftp.eos.ncsu.edu in the directory /pub/spectra.
These spectra are plotted in Figures G.16 through G.25.
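
The tables that follow list one value every 5 nanometers from 380 to 775, while the measurements cover only [390, 730] at 2-nanometer steps. The text does not say which interpolation method was used, so the following is only a minimal sketch of one plausible resampling, assuming linear interpolation between neighboring samples and assuming that wavelengths outside the measured range simply take the nearest measured value (the tabulated entries are constant below 390 and above 730, which is consistent with that assumption). The function name and the sample data are hypothetical.

```python
from bisect import bisect_right


def resample_spectrum(src_wavelengths, src_values, targets):
    """Resample a spectrum at the given target wavelengths.

    src_wavelengths must be strictly increasing. Inside the measured range
    the result is linearly interpolated; outside it, the nearest end value
    is held constant (an assumption, not something the text specifies).
    """
    if not src_wavelengths or len(src_wavelengths) != len(src_values):
        raise ValueError("need equally long, non-empty sample lists")
    out = []
    for t in targets:
        if t <= src_wavelengths[0]:
            out.append(src_values[0])       # clamp below the measured range
        elif t >= src_wavelengths[-1]:
            out.append(src_values[-1])      # clamp above the measured range
        else:
            hi = bisect_right(src_wavelengths, t)  # first sample beyond t
            lo = hi - 1
            w0, w1 = src_wavelengths[lo], src_wavelengths[hi]
            frac = (t - w0) / (w1 - w0)
            out.append(src_values[lo] + frac * (src_values[hi] - src_values[lo]))
    return out


# Hypothetical measurements: a made-up linear ramp sampled every 2 nm.
measured_nm = list(range(390, 731, 2))
measured = [0.03 + 0.001 * i for i in range(len(measured_nm))]

# The 5 nm grid used by the tables in this section.
table_nm = list(range(380, 780, 5))
table = resample_spectrum(measured_nm, measured, table_nm)
```

With this scheme, every tabulated wavelength that is a multiple of 10 coincides with a measured sample and is copied unchanged, while the remaining entries fall midway between two measurements.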



λ

Pine 

needles 

Silver 

maple 

leaf 

Dark 

green 

maple 

leaf 

Red 

maple 

leaf 

λ

Pine 

needles 

Silver 

maple 

leaf 

Dark 

green 

maple 

leaf 

Red 

maple 

leaf 

380 

0.029600 

0.039800 

0.036300 

0.018600 

580 

0.085100 

0.071300 

0.156900 

0.054100 

385 

0.029600 

0.039800 

0.036300 

0.018600 

585 

0.079141 

0.068097 

0.146819 

0.049989 

390 

0.029600 

0.039800 

0.036300 

0.018600 

590 

0.075200 

0.066100 

0.139500 

0.047400 

395 

0.028803 

0.040595 

0.035490 

0.018680 

595 

0.072491 

0.065321 

0.134735 

0.045560 

400 

0.027800 

0.041500 

0.034200 

0.018300 

600 

0.070700 

0.064500 

0.131200 

0.044200 

405 

0.028613 

0.040981 

0.033669 

0.018856 

605 

0.068064 

0.063573 

0.126421 

0.042232 

410 

0.029200 

0.041400 

0.033600 

0.019200 

610 

0.063900 

0.061600 

0.118800 

0.039200 

415 

0.029513 

0.041243 

0.033055 

0.019189 

615 

0.059599 

0.059510 

0.110514 

0.036151 

420 

0.030600 

0.041600 

0.034000 

0.019700 

620 

0.056400 

0.058000 

0.104000 

0.034000 

425 

0.031044 

0.041406 

0.034715 

0.019702 

625 

0.054789 

0.057071 

0.100686 

0.032827 

430 

0.031200 

0.041100 

0.035200 

0.019800 

630 

0.054300 

0.057000 

0.099400 

0.032400 

435 

0.031491 

0.041038 

0.035971 

0.019751 

635 

0.052772 

0.056103 

0.095965 

0.031496 

440 

0.031700 

0.041000 

0.036500 

0.019700 

640 

0.049300 

0.054000 

0.087200 

0.029100 

445 

0.031994 

0.041121 

0.037276 

0.019789 

645 

0.045315 

0.051874 

0.077085 

0.026255 

450 

0.032600 

0.041500 

0.038400 

0.019900 

650 

0.042600 

0.050300 

0.069700 

0.024400 

455 

0.032871 

0.041761 

0.039155 

0.020063 

655 

0.041167 

0.049755 

0.066340 

0.023362 

460 

0.033100 

0.041700 

0.039800 

0.020000 

660 

0.039400 

0.049000 

0.061900 

0.022200 

465 

0.033347 

0.043096 

0.040400 

0.020038 

665 

0.037525 

0.047818 

0.056503 

0.021029 

470 

0.033400 

0.043500 

0.041000 

0.020100 

670 

0.036800 

0.047600 

0.053800 

0.020700 

475 

0.034198 

0.042944 

0.041977 

0.019992 

675 

0.036856 

0.047899 

0.053596 

0.021133 

480 

0.034600 

0.043200 

0.042500 

0.020100 

680 

0.038000 

0.048800 

0.055600 

0.022100 

485 

0.035069 

0.043571 

0.043554 

0.020086 

685 

0.040924 

0.051119 

0.062490 

0.023933 

490 

0.035500 

0.044000 

0.044900 

0.020300 

690 

0.049700 

0.056600 

0.085700 

0.030500 

495 

0.036887 

0.044616 

0.046992 

0.021790 

695 

0.076329 

0.073143 

0.141592 

0.049319 

500 

0.038800 

0.045200 

0.051800 

0.022800 

700 

0.127300 

0.110600 

0.224300 

0.083900 

505 

0.042594 

0.046846 

0.060185 

0.025527 

705 

0.191700 

0.164531 

0.312366 

0.129966 

510 

0.049000 

0.050000 

0.073900 

0.030700 

710 

0.256600 

0.224500 

0.391500 

0.183600 

515 

0.059917 

0.055638 

0.094850 

0.039821 

715 

0.319407 

0.286062 

0.459499 

0.242201 

520 

0.076500 

0.064400 

0.124100 

0.052200 

720 

0.378800 

0.347500 

0.517100 

0.304300 

525 

0.095189 

0.074645 

0.155038 

0.065529 

725 

0.433914 

0.407055 

0.563011 

0.364725 

530 

0.111300 

0.083900 

0.182400 

0.075800 

730 

0.479800 

0.460100 

0.596300 

0.419800 

535 

0.121705 

0.089734 

0.201339 

0.081755 

735 

0.479800 

0.460100 

0.596300 

0.419800 

540 

0.127200 

0.092700 

0.212400 

0.084700 

740 

0.479800 

0.460100 

0.596300 

0.419800 

545 

0.130728 

0.094851 

0.219945 

0.086575 

745 

0.479800 

0.460100 

0.596300 

0.419800 

550 

0.133200 

0.096500 

0.225500 

0.088100 

750 

0.479800 

0.460100 

0.596300 

0.419800 

555 

0.133128 

0.097030 

0.226116 

0.087837 

755 

0.479800 

0.460100 

0.596300 

0.419800 

560 

0.128000 

0.094400 

0.219800 

0.084100 

760 

0.479800 

0.460100 

0.596300 

0.419800 

565 

0.118589 

0.089177 

0.207223 

0.077073 

765 

0.479800 

0.460100 

0.596300 

0.419800 

570 

0.105800 

0.082100 

0.189400 

0.068300 

770 

0.479800 

0.460100 

0.596300 

0.419800 

575 

0.093736 

0.075570 

0.171017 

0.059939 

775 

0.479800 

0.460100 

0.596300 

0.419800 


TABLE G.17

Spectra for pine needles and maple leaves (silver, dark green, and red). 
FIGURE G.16

Spectral plots for pine needles and maple leaves (silver, dark green, and red). 



FIGURE G.17

Spectral plots for grass, soil, vine leaf, and asphalt. 





λ

Grass 

Soil 

Vine 

leaf 

Asphalt 

λ

Grass 

Soil 

Vine 

leaf 

Asphalt 

380 

0.018800 

0.030900 

0.022000 

0.068400 

580 

0.033800 

0.103300 

0.052600 

0.064900 

385 

0.018800 

0.030900 

0.022000 

0.068400 

585 

0.034492 

0.096441 

0.048718 

0.063282 

390 

0.018800 

0.030900 

0.022000 

0.068400 

590 

0.035200 

0.092100 

0.046100 

0.061700 

395 

0.018802 

0.030570 

0.023131 

0.080352 

595 

0.036003 

0.090051 

0.044805 

0.060297 

400 

0.019000 

0.029700 

0.022900 

0.090700 

600 

0.036800 

0.088600 

0.043800 

0.059100 

405 

0.019211 

0.029487 

0.022865 

0.099929 

605 

0.037695 

0.084619 

0.041973 

0.058133 

410 

0.019600 

0.029500 

0.023400 

0.110500 

610 

0.038500 

0.077700 

0.039200 

0.057200 

415 

0.019675 

0.029776 

0.023280 

0.128480 

615 

0.039400 

0.070651 

0.036258 

0.056147 

420 

0.020300 

0.030900 

0.023700 

0.133500 

620 

0.040400 

0.065800 

0.034400 

0.055200 

425 

0.020498 

0.031547 

0.024150 

0.135845 

625 

0.041303 

0.063415 

0.033392 

0.054103 

430 

0.020800 

0.032000 

0.024000 

0.133500 

630 

0.042300 

0.062700 

0.033100 

0.053000 

435 

0.021082 

0.032531 

0.024076 

0.130632 

635 

0.043182 

0.061868 

0.032586 

0.051836 

440 

0.021500 

0.033000 

0.024200 

0.128600 

640 

0.044400 

0.058800 

0.031000 

0.051000 

445 

0.021983 

0.033474 

0.024393 

0.125585 

645 

0.045464 

0.054444 

0.029281 

0.050195 

450 

0.022200 

0.034000 

0.024600 

0.122700 

650 

0.046500 

0.050000 

0.027800 

0.049600 

455 

0.022402 

0.034255 

0.024548 

0.119231 

655 

0.047588 

0.046240 

0.027067 

0.049407 

460 

0.022700 

0.034400 

0.024500 

0.116100 

660 

0.048700 

0.041900 

0.026100 

0.049400 

465 

0.023049 

0.034803 

0.024599 

0.113976 

665 

0.049888 

0.038002 

0.025316 

0.050173 

470 

0.023400 

0.035400 

0.025200 

0.110900 

670 

0.051300 

0.036300 

0.025200 

0.051700 

475 

0.023649 

0.035487 

0.025411 

0.108572 

675 

0.052712 

0.036128 

0.025691 

0.054277 

480 

0.023900 

0.035900 

0.025300 

0.105900 

680 

0.054100 

0.037000 

0.026500 

0.058100 

485 

0.024187 

0.036050 

0.025407 

0.103651 

685 

0.055643 

0.039818 

0.028035 

0.063422 

490 

0.024500 

0.036400 

0.025600 

0.102000 

690 

0.057200 

0.052100 

0.032500 

0.071000 

495 

0.024909 

0.037328 

0.025752 

0.099743 

695 

0.058886 

0.085055 

0.047704 

0.081564 

500 

0.025200 

0.039200 

0.026300 

0.098200 

700 

0.060600 

0.137700 

0.081800 

0.097000 

505 

0.025591 

0.043587 

0.027604 

0.095556 

705 

0.062298 

0.195154 

0.126645 

0.118016 

510 

0.026100 

0.051600 

0.030800 

0.092900 

710 

0.063900 

0.250400 

0.175000 

0.146300 

515 

0.026512 

0.065044 

0.036376 

0.090211 

715 

0.065747 

0.302173 

0.223237

0.180895 

520 

0.026900 

0.084400 

0.045900 

0.087100 

720 

0.067700 

0.349500 

0.269700

0.220900 

525 

0.027338 

0.105930 

0.057947 

0.084199 

725 

0.069607 

0.390741 

0.312332 

0.263639 

530 

0.027900 

0.123800 

0.069100 

0.081400 

730 

0.071400 

0.424400 

0.347900 

0.306600 

535 

0.028400 

0.136531 

0.076816 

0.078576 

735 

0.071400 

0.424400 

0.347900 

0.306600 

540 

0.028900 

0.143300 

0.081000 

0.076000 

740 

0.071400 

0.424400 

0.347900 

0.306600 

545 

0.029509 

0.148244 

0.083776 

0.073917 

745 

0.071400 

0.424400 

0.347900 

0.306600 

550 

0.030100 

0.152300 

0.085800 

0.072400 

750 

0.071400 

0.424400 

0.347900 

0.306600 

555 

0.030648 

0.152214 

0.085457 

0.070747 

755

0.071400

0.424400

0.347900 

0.306600 

560 

0.031300 

0.147300 

0.081800 

0.069600 

760

0.071400

0.424400

0.347900

0.306600

565 

0.031900 


0.075425 

0.068663 

765 

0.071400 

0.424400 

0.347900 

0.306600 

570 

0.032600 



0.067600 

770 

0.071400 

0.424400

0.347900

0.306600

575 

0.033262 

0.113764 

0.058398 

0.066246

775

0.071400

0.424400

0.347900

0.306600


TABLE G.18

Spectra for grass, soil, vine leaf, and asphalt. 


































λ

Daisy

white

petals

Daisy 

yellow 

center 

Marigold 

orange 

Marigold 

yellow 

λ

Daisy

white

petals

Daisy 

yellow 

center 

Marigold 

orange 

Marigold 

yellow 

380 

0.028800 

0.025700 

0.035200 

0.058700 

580 

0.391400 

0.411700 

0.503200 

0.214500 

385 

0.028800 

0.025700 

0.035200 

0.058700 

585 

0.395083 

0.414181 

0.508292 

0.251287 

390 

0.028800 

0.025700 

0.035200 

0.058700 

590 

0.399300 

0.417400 

0.513200 

0.293600 

395 

0.022880 

0.018409 

0.026572 

0.058646 

595 

0.403365 

0.421022 

0.517930 

0.340226 

400 

0.017900 

0.013400 

0.021700 

0.068000 

600 

0.406100 

0.423600 

0.521500 

0.386500 

405 

0.016032 

0.012376 

0.016391 

0.085506 

605 

0.408193 

0.423460 

0.524462 

0.426925 

410 

0.014700 

0.010200 

0.014400 

0.111800 

610 

0.407000 

0.420300 

0.527500 

0.460100 

415 

0.013631 

0.008862 

0.011961 

0.139686 

615 

0.404087 

0.416098 

0.528865 

0.484983 

420 

0.014100 

0.008200 

0.011900 

0.169400 

620 

0.401800 

0.411500 

0.530300 

0.502800 

425 

0.013882 

0.008100 

0.012080 

0.194080 

625 

0.401092 

0.409648 

0.532011 

0.514647 

430 

0.014800 

0.008700 

0.012300 

0.210600 

630 

0.402900 

0.410600 

0.534700 

0.524400 

435 

0.015291 

0.008621 

0.014122 

0.225214 

635 

0.403656 

0.411247 

0.536570 

0.530300 

440 

0.015700 

0.008300 

0.016700 

0.227800 

640 

0.403600 

0.411000 

0.537000 

0.534900 

445 

0.016063 

0.008265 

0.019407 

0.228770 

645 

0.400851 

0.408202 

0.536595 

0.537864 

450 

0.016100 

0.008600 

0.022200 

0.228100 

650 

0.394500 

0.400800 

0.535500 

0.539600 

455 

0.016618 

0.009442 

0.024787 

0.224419 

655 

0.387390 

0.389830 

0.534526 

0.540692 

460 

0.016900 

0.010500 

0.028500 

0.220500 

660 

0.374100 

0.369100 

0.533000 

0.541800 

465 

0.017775 

0.011694 

0.033288 

0.216800 

665 

0.354589 

0.339040 

0.527190 

0.541834 

470 

0.018500 

0.011700 

0.037800 

0.210300 

670 

0.338200 

0.312800 

0.523100 

0.544400 

475 

0.018538 

0.011286 

0.038786 

0.203693 

675 

0.329312 

0.298273 

0.521503 

0.544910 

480 

0.018500 

0.011700 

0.037500 

0.195300 

680 

0.329600 

0.296100 

0.524000 

0.545800 

485 

0.018782 

0.013526 

0.038364 

0.186230 

685 

0.349426 

0.321232 

0.535810 

0.547162 

490 

0.019800 

0.018800 

0.043600 

0.175600 

690 

0.389400 

0.373600 

0.553400 

0.548500 

495 

0.022031 

0.029949 

0.054371 

0.165735 

695 

0.433821 

0.424166 

0.568270 

0.549694 

500 

0.027100 

0.049900 

0.074600 

0.156300 

700 

0.471500 

0.460700 

0.579100 

0.550900 

505 

0.037496 

0.081222 

0.100825 

0.146827 

705 

0.500947 

0.483861 

0.584743 

0.550557 

510 

0.058000 

0.120600 

0.136800 

0.138800 

710 

0.523400 

0.499700 

0.590400 

0.550000 

515 

0.092116 

0.167515 

0.179710 

0.131694 

715 

0.542755 

0.510764 

0.593858 

0.549949 

520 

0.137400 

0.214400 

0.225100 

0.125100 

720 

0.557900 

0.519600 

0.597200 

0.548400 

525 

0.183722 

0.257637 

0.268897 

0.120244 

725 

0.570330 

0.524864 

0.598571 

0.547954 

530 

0.226300 

0.294900 

0.309400 

0.117800 

730 

0.579400 

0.528000 

0.598300 

0.545800 

535 

0.261351 

0.324177 

0.345784 

0.115398 

735 

0.579400 

0.528000 

0.598300 

0.545800 

540 

0.291200 

0.347000 

0.376900 

0.115000 

740 

0.579400 

0.528000 

0.598300 

0.545800 

545 

0.314826 

0.363243 

0.403360 

0.115541 

745 

0.579400 

0.528000 

0.598300 

0.545800 

550 

0.333000 

0.376300 

0.424700 

0.118700 

750 

0.579400 

0.528000 

0.598300 

0.545800 

555 

0.347621 

0.386112 

0.439731 

0.123379 

755 

0.579400 

0.528000 

0.598300 

0.545800 

560 

0.359100 

0.393600 

0.453300 

0.131000 

760 

0.579400 

0.528000 

0.598300 

0.545800 

565 

0.370424 

0.400413 

0.467691 

0.142828 

765 

0.579400 

0.528000 

0.598300 

0.545800 

570 

0.380300 

0.405100 

0.484600 

0.159900 

770 

0.579400 

0.528000 

0.598300 

0.545800 

575 

0.386346 

0.408152 

0.494814 

0.183483 

775 

0.579400 

0.528000 

0.598300 

0.545800 


TABLE G.19

Spectra for daisies (white petals and yellow center) and marigolds (orange and yellow). 



FIGURE G.18

Spectral plots for daisies (white petals and yellow center) and marigolds (orange and yellow). 



FIGURE G.19

Spectral plots for blue jeans (dark blue and faded), dark blue sweatpants, and denim. 











λ

Dark 

blue 

jeans 

Faded 

jeans 

Dark 

blue 

sweat 

pants 

Denim 

λ

Dark 

blue 

jeans

Faded

jeans

Dark 

blue 

sweat 

pants 

Denim 

380 

0.268700

0.161400 

0.062500 

0.028700 

580 

0.455900 

0.107600 

0.081400 

0.032400 

385 

0.268700 

0.161400 

0.062500 

0.028700 

585 

0.456395 

0.109610 

0.111418 

0.031802 

390 

0.268700 

0.161400 

0.062500 

0.028700 

590 

0.458000 

0.116800 

0.157000 

0.031200 

395 

0.286980 

0.226345 

0.059801 

0.029209 

595 

0.458866 

0.129143 

0.215848 

0.030958 

400 

0.297300 

0.298700 

0.057100 

0.028500 

600 

0.460300 

0.145300 

0.283100 

0.030700 

405 

0.306263 

0.369217 

0.056204 

0.028691 

605 

0.461855 

0.163165 

0.351420 

0.030612 

410 

0.318700 

0.423100 

0.057000 

0.029100 

610 

0.463600

0.179400 

0.412000 

0.030600 

415 

0.330186 

0.441091 

0.057408 

0.029055 

615 

0.464554

0.192187 

0.464012 

0.030544 

420 

0.347600 

0.453500 

0.058500 

0.029500 

620 

0.465300 

0.202200

0.502900 

0.030700 

425 

0.353125 

0.453739 

0.057932 

0.029413 

625 

0.466899 

0.209755

0.530510 

0.030841 

430 

0.360700 

0.444900 

0.057900 

0.029600 

630 

0.469100 

0.216300 

0.551300 

0.031200 

435 

0.364644 

0.438727 

0.057738 

0.030913 

635 

0.469645 

0.222692 

0.565688 

0.031441 

440 

0.371600 

0.423900 

0.056900 

0.031700 

640 

0.471400 

0.229800 

0.577000 

0.031900 

445 

0.377375 

0.410957 

0.056796 

0.032036 

645 

0.472649 

0.237935 

0.586149 

0.032659 

450 

0.383700 

0.397400 

0.056300 

0.032600 

650 

0.473600 

0.246300 

0.593100 

0.033300 

455 

0.387128 

0.375020 

0.055206 

0.033467 

655 

0.474737 

0.255862 

0.598725 

0.034343 

460 

0.391700 

0.354700 

0.054800 

0.034400 

660

0.476900 

0.266300 

0.604800 

0.035700 

465 

0.397751 

0.335829 

0.054414 

0.036025 

665 

0.478933 

0.277449 

0.608464 

0.037313 

470 

0.402600 

0.315700 

0.053900 

0.037800 

670

0.482000 

0.290400 

0.612300 

0.039800 

475 

0.405296 

0.297177 

0.053058 

0.039841 

675

0.484533 

0.303762 

0.615180 

0.042889 

480 

0.409200 

0.277400 

0.052200 

0.042400 

680

0.487600 

0.318800 

0.618200 

0.047300 

485 

0.412220 

0.259495 

0.051811 

0.044814 

685 

0.490451 

0.334302 

0.621256 

0.053033 

490 

0.415400 

0.238300 

0.051000 

0.046800 

690 

0.493900 

0.350400 

0.624600 

0.060200 

495 

0.416431 

0.219584 

0.050541 

0.047834 

695 

0.496021 

0.367197 

0.627899 

0.069335 

500 

0.420100 

0.202100 

0.050200 

0.048000 

700 

0.498600 

0.384300 

0.630800 

0.079700 

505 

0.423178 

0.188678 

0.049578 

0.047626 

705 

0.500664 

0.398953 

0.632980 

0.091378 

510 

0.425300 

0.178900 

0.048900 

0.047100 

710 

0.501300 

0.414200 

0.634600 

0.103700 

515 

0.427807 

0.169443 

0.048345 

0.046351 

715 

0.504693 

0.428654 

0.636618 

0.116187 

520 

0.429800 

0.158900 

0.048800 

0.045600 

720 

0.505300 

0.441800 

0.638500 

0.129000 

525 

0.432536 

0.147388 

0.048235 

0.044664 

725 

0.507215 

0.453581 

0.639105 

0.141529 

530 

0.435000 

0.136000 

0.048800 

0.043600 

730 

0.507200 

0.461700 

0.640600 

0.153600 

535 

0.437551 

0.126196 

0.048795 

0.042500 

735 

0.507200 

0.461700 

0.640600 

0.153600 

540 

0.439800 

0.119700 

0.048800 

0.041200 

740 

0.507200 

0.461700 

0.640600 

0.153600 

545 

0.441737 

0.117381 

0.049468 

0.040115 

745 

0.507200 

0.461700 

0.640600 

0.153600 

550 

0.443700 

0.117500 

0.049400 

0.038800 

750 

0.507200 

0.461700 

0.640600 

0.153600

555 

0.444670 

0.118509 

0.050465 

0.037499 

755 

0.507200 

0.461700 

0.640600 

0.153600 

560 

0.446600 

0.118900 

0.050900 

0.036300 

760 

0.507200 

0.461700 

0.640600 

0.153600 

565 

0.449170 

0.116871 

0.052435 

0.035093 

765 

0.507200 

0.461700 

0.640600 

0.153600 

570 

0.451000 

0.113300 

0.056000 

0.034100 

770 

0.507200 

0.461700 

0.640600 

0.153600 

575 

0.453509 

0.109049 

0.063744 

0.033143 

775 

0.507200 

0.461700 

0.640600 

0.153600 


TABLE G.20

Spectra for blue jeans (dark blue and faded), dark blue sweatpants, and denim. 







TABLE G.21

Spectra for wheat bread (bread and crust), pancake, and Swiss army knife. 







































































































































































FIGURE G.20

Spectral plots for wheat bread (bread and crust), pancake, and Swiss army knife. 



FIGURE G.21

Spectral plots for wood (pine, maple, and oak) and bamboo. 







λ

Pine 

wood 

Maple 

wood 

Oak 

wood 

Bamboo 

λ

Pine 

wood 

Maple 

wood 

Oak 

wood 

Bamboo 

380 

0.064500 

0.084100 

0.057300 

0.071900 

580 

0.337600 

0.362500 

0.385500 

0.479000 

385 

0.064500 

0.084100 

0.057300 

0.071900 

585 

0.341304 

0.370425 

0.396347 

0.490194 

390 

0.064500 

0.084100 

0.057300 

0.071900 

590 

0.346200 

0.378300 

0.407400 

0.503000 

395 

0.068718 

0.090484 

0.057437 

0.072583 

595 

0.350895 

0.386917 

0.419605 

0.515244 

400 

0.073700 

0.094000 

0.055600 

0.073200 

600 

0.356100 

0.394900 

0.430800 

0.527100 

405 

0.082748 

0.100325 

0.057740 

0.079685 

605 

0.361581 

0.403158 

0.442333 

0.539090 

410 

0.095200 

0.108300 

0.061700 

0.088400 

610 

0.367800 

0.410700 

0.453300 

0.550700 

415 

0.104713 

0.116287 

0.068096 

0.096282 

615 

0.373057 

0.417947 

0.464678 

0.561204 

420 

0.116600 

0.124200 

0.075500 

0.106400 

620 

0.378800 

0.425400 

0.475200 

0.570900 

425 

0.126632 

0.132052 

0.083912 

0.115930 

625 

0.384713 

0.432113 

0.485331 

0.580904 

430 

0.137400 

0.140400 

0.093600 

0.125800 

630 

0.392000 

0.439400 

0.496900 

0.590500 

435 

0.145631 

0.146844 

0.102628 

0.133918 

635 

0.397608 

0.446451 

0.507248 

0.600316 

440 

0.156400 

0.158600 

0.112200 

0.142900 

640 

0.404100 

0.453700 

0.517700 

0.609800 

445 

0.164081 

0.163063 

0.121981 

0.152011 

645 

0.410208 

0.460383 

0.528485 

0.618494 

450 

0.173000 

0.171000 

0.131300 

0.166200 

650 

0.415000 

0.466200 

0.538000 

0.626200 

455 

0.181318 

0.176307 

0.141859 

0.174713 

655 

0.419782 

0.472380 

0.547200 

0.633200 

460 

0.188600 

0.183300 

0.151000 

0.184500 

660 

0.425800 

0.478500 

0.557300 

0.640400 

465 

0.198000 

0.188929 

0.161759 

0.195434 

665 

0.431287 

0.484360 

0.567096 

0.648115 

470 

0.207100 

0.195600 

0.170400 

0.205600 

670 

0.437500 

0.491200 

0.577800 

0.656700 

475 

0.215280 

0.201647 

0.178030 

0.217829 

675 

0.442778 

0.497676 

0.587393 

0.663958 

480 

0.223300 

0.207500 

0.186800 

0.227000 

680 

0.447500 

0.503400 

0.598700 

0.671100 

485 

0.232423 

0.213873 

0.195861 

0.238774 

685 

0.451066 

0.509587 

0.607887 

0.677915 

490 

0.240200 

0.220300 

0.204300 

0.250300 

690 

0.454500 

0.515700 

0.618400 

0.685800 

495 

0.248112 

0.226527 

0.212957 

0.260833 

695 

0.458159 

0.521529 

0.627128 

0.690574 

500 

0.255800 

0.232400 

0.222500 

0.272000 

700 

0.463600 

0.526600 

0.636700 

0.697800 

505 

0.263335 

0.239443 

0.230962 

0.283801 

705 

0.468595 

0.531872 

0.645513 

0.701852 

510 

0.269700 

0.246500 

0.239600 

0.296500 

710 

0.473200 

0.537000 

0.653700 

0.706800 

515 

0.277259 

0.253803 

0.249750 

0.308582 

715 

0.477929 

0.541029 

0.661581 

0.712460 

520 

0.283000 

0.261800 

0.259300 

0.321300 

720 

0.482300 

0.546500 

0.668400 

0.716200 

525 

0.288381 

0.269478 

0.269434 

0.333680 

725 

0.486528 

0.550386 

0.676647 

0.720197 

530 

0.294700 

0.277000 

0.278500 

0.346900 

730 

0.488800 

0.553200 

0.682100 

0.722200 

535 

0.299307 

0.285165 

0.287866 

0.359636 

735 

0.488800 

0.553200 

0.682100 

0.722200 

540 

0.304100 

0.293400 

0.298300 

0.373500 

740 

0.488800 

0.553200 

0.682100 

0.722200 

545 

0.308641 

0.302089 

0.308739 

0.385863 

745 

0.488800 

0.553200 

0.682100 

0.722200 

550 

0.313100 

0.311000 

0.319000 

0.399400 

750 

0.488800 

0.553200 

0.682100 

0.722200 

555 

0.317112 

0.319455 

0.329613 

0.412664 

755 

0.488800 

0.553200 

0.682100 

0.722200 

560 

0.320900 

0.328300 

0.340400 

0.425900 

760 

0.488800 

0.553200 

0.682100 

0.722200 

565 

0.325643 

0.337026 

0.352314 

0.439588 

765 

0.488800 

0.553200 

0.682100 

0.722200 

570 

0.329800 

0.346600 

0.363600 

0.453300 

770 

0.488800 

0.553200 

0.682100 

0.722200 

575 

0.333401 

0.354253 

0.374349 

0.465823 

775 

0.488800 

0.553200 

0.682100 

0.722200 


TABLE G.22 

Spectra for wood (pine, maple, and oak) and bamboo. 




λ (nm)  Redwood   Walnut wood  Yellow banana  Ripe brown banana
380     0.061200  0.069900     0.143300       0.168700
385     0.061200  0.069900     0.143300       0.168700
390     0.061200  0.069900     0.143300       0.168700
395     0.066172  0.073110     0.150786       0.190794
400     0.067900  0.074800     0.155700       0.195000
405     0.070480  0.075966     0.159610       0.183664
410     0.073700  0.077900     0.164300       0.169200
415     0.076424  0.078864     0.167984       0.148548
420     0.078600  0.080800     0.174700       0.134600
425     0.082089  0.081801     0.181704       0.119788
430     0.084100  0.088500     0.189700       0.107700
435     0.086642  0.090653     0.199238       0.097770
440     0.089200  0.092000     0.208800       0.090600
445     0.089773  0.094102     0.216270       0.086080
450     0.092600  0.094700     0.223100       0.083600
455     0.094359  0.096410     0.227468       0.082565
460     0.097100  0.097800     0.235700       0.082800
465     0.098639  0.099327     0.245483       0.085090
470     0.101700  0.100300     0.254500       0.091700
475     0.103200  0.101385     0.259406       0.105027
480     0.105800  0.102200     0.266000       0.129000
485     0.107605  0.104301     0.272314       0.164101
490     0.109300  0.105600     0.280700       0.218400
495     0.111456  0.107259     0.289338       0.288877
500     0.114000  0.108700     0.299500       0.378700
505     0.116402  0.110247     0.309143       0.488499
510     0.119100  0.111800     0.317800       0.605000
515     0.121316  0.113665     0.326200       0.712526
520     0.124500  0.115700     0.333700       0.780500
525     0.127445  0.117235     0.341678       0.806134
530     0.130900  0.119300     0.350100       0.802300
535     0.134559  0.121210     0.358632       0.785143
540     0.138500  0.123300     0.366900       0.765700
545     0.142668  0.125380     0.376320       0.748110
550     0.146700  0.127600     0.384200       0.731600
555     0.151142  0.129769     0.393096       0.714550
560     0.155600  0.131900     0.402000       0.699300
565     0.160788  0.134540     0.410504       0.685223
570     0.166300  0.137100     0.419300       0.672500
575     0.171310  0.139224     0.426955       0.658712
580     0.176800  0.141500     0.435800       0.643700
585     0.181940  0.143904     0.443349       0.624895
590     0.187300  0.146100     0.451700       0.603900
595     0.193028  0.148581     0.459972       0.580161
600     0.198800  0.151200     0.468700       0.557600
605     0.205169  0.153866     0.478353       0.537136
610     0.211100  0.156500     0.486400       0.521400
615     0.216858  0.159086     0.494538       0.511057
620     0.222700  0.162000     0.502200       0.505100
625     0.228475  0.164825     0.510005       0.502012
630     0.234800  0.168000     0.518500       0.499900
635     0.240596  0.171098     0.526200       0.497272
640     0.246900  0.174100     0.534400       0.496100
645     0.253322  0.177629     0.542413       0.497777
650     0.259300  0.181100     0.549400       0.498900
655     0.265378  0.184315     0.556135       0.505712
660     0.272000  0.188000     0.563400       0.512500
665     0.278704  0.191696     0.570739       0.520993
670     0.286100  0.196200     0.578800       0.530300
675     0.293638  0.200470     0.587468       0.543572
680     0.301500  0.204800     0.596900       0.556800
685     0.308943  0.209732     0.604826       0.568434
690     0.316800  0.214400     0.613000       0.577700
695     0.324036  0.219020     0.621465       0.584393
700     0.332300  0.223900     0.628800       0.589000
705     0.339708  0.228504     0.635909       0.590256
710     0.347300  0.233400     0.643100       0.589800
715     0.354677  0.238472     0.649656       0.589926
720     0.363100  0.243300     0.655300       0.588400
725     0.369773  0.248262     0.660411       0.589531
730     0.377700  0.252900     0.666100       0.590200
735     0.377700  0.252900     0.666100       0.590200
740     0.377700  0.252900     0.666100       0.590200
745     0.377700  0.252900     0.666100       0.590200
750     0.377700  0.252900     0.666100       0.590200
755     0.377700  0.252900     0.666100       0.590200
760     0.377700  0.252900     0.666100       0.590200
765     0.377700  0.252900     0.666100       0.590200
770     0.377700  0.252900     0.666100       0.590200
775     0.377700  0.252900     0.666100       0.590200


TABLE G.23 

Spectra for wood (redwood and walnut) and bananas (yellow and ripe brown). 
FIGURE G.22 

Spectral plots for wood (redwood and walnut) and bananas (yellow and ripe brown). 



FIGURE G.23 

Spectral plots for cucumber, corn (kernel and husk), and yellow Delicious apple. 






λ (nm)  Cucumber  Corn kernel  Corn husk  Yellow Delicious apple
380     0.021900  0.038300     0.063300   0.033200
385     0.021900  0.038300     0.063300   0.033200
390     0.021900  0.038300     0.063300   0.033200
395     0.024674  0.039030     0.056170   0.032803
400     0.024600  0.039700     0.053100   0.031400
405     0.025234  0.041219     0.058641   0.031723
410     0.025500  0.043500     0.065300   0.032000
415     0.025080  0.045362     0.072228   0.032260
420     0.029600  0.048000     0.079200   0.033200
425     0.029724  0.050382     0.088192   0.033764
430     0.029600  0.051800     0.097300   0.034700
435     0.029319  0.053646     0.106038   0.035408
440     0.029200  0.055500     0.108500   0.036500
445     0.029092  0.056896     0.106753   0.037562
450     0.029400  0.060100     0.105900   0.038700
455     0.029287  0.061714     0.108800   0.039358
460     0.029300  0.063800     0.117400   0.040300
465     0.029247  0.064686     0.128304   0.041824
470     0.029200  0.065300     0.132100   0.042500
475     0.029123  0.065614     0.127410   0.043368
480     0.029100  0.066500     0.120800   0.045400
485     0.029153  0.067963     0.125280   0.046097
490     0.029200  0.070200     0.143200   0.047900
495     0.028993  0.074272     0.174151   0.049060
500     0.029000  0.081300     0.220000   0.050700
505     0.029109  0.092537     0.273968   0.053039
510     0.029200  0.109000     0.330800   0.055800
515     0.029319  0.130748     0.384128   0.059323
520     0.029200  0.155800     0.430600   0.064100
525     0.029309  0.180395     0.467646   0.069218
530     0.029300  0.199400     0.497200   0.075200
535     0.029409  0.210905     0.521639   0.080795
540     0.029500  0.217600     0.541000   0.086600
545     0.029591  0.222459     0.557374   0.092313
550     0.029800  0.225800     0.568200   0.098000
555     0.029951  0.226158     0.576834   0.103044
560     0.030100  0.221200     0.585200   0.107200
565     0.030199  0.212331     0.598699   0.110777
570     0.030400  0.200000     0.611900   0.113300
575     0.030552  0.186925     0.620716   0.114926
580     0.030700  0.176800     0.627600   0.117400
585     0.031047  0.169011     0.632909   0.119696
590     0.031300  0.163600     0.638000   0.122700
595     0.031559  0.160106     0.642686   0.126235
600     0.031700  0.157200     0.645900   0.129700
605     0.031939  0.152870     0.649673   0.133033
610     0.032200  0.145600     0.652300   0.135200
615     0.032502  0.137547     0.654657   0.137289
620     0.032800  0.131500     0.655600   0.139400
625     0.033191  0.128401     0.657084   0.142280
630     0.033800  0.127800     0.660300   0.146400
635     0.034258  0.125512     0.661430   0.148518
640     0.034500  0.118500     0.660500   0.148000
645     0.034589  0.108264     0.655692   0.146080
650     0.034600  0.099200     0.651200   0.144300
655     0.034391  0.093511     0.648822   0.145401
660     0.034300  0.086000     0.642700   0.145500
665     0.034495  0.077683     0.630899   0.144514
670     0.035200  0.072800     0.620800   0.144900
675     0.036288  0.071492     0.616484   0.147287
680     0.038100  0.073800     0.621500   0.152500
685     0.040346  0.082869     0.643807   0.162758
690     0.042700  0.109100     0.673800   0.185200
695     0.044845  0.158451     0.694311   0.224842
700     0.046900  0.219400     0.705000   0.272000
705     0.048847  0.273197     0.709440   0.315162
710     0.050900  0.316800     0.710800   0.350800
715     0.053041  0.353899     0.709191   0.381746
720     0.055300  0.386000     0.704400   0.409300
725     0.057767  0.412935     0.700084   0.434222
730     0.060000  0.432700     0.691400   0.453200
735     0.060000  0.432700     0.691400   0.453200
740     0.060000  0.432700     0.691400   0.453200
745     0.060000  0.432700     0.691400   0.453200
750     0.060000  0.432700     0.691400   0.453200
755     0.060000  0.432700     0.691400   0.453200
760     0.060000  0.432700     0.691400   0.453200
765     0.060000  0.432700     0.691400   0.453200
770     0.060000  0.432700     0.691400   0.453200
775     0.060000  0.432700     0.691400   0.453200


TABLE G.24 

Spectra for cucumber, corn (kernel and husk), and yellow Delicious apple. 








λ (nm)  Green pepper  Lemon skin  Lettuce   Carrot
380     0.094600      0.120000    0.095600  0.061300
385     0.094600      0.120000    0.095600  0.061300
390     0.094600      0.120000    0.095600  0.061300
395     0.093326      0.126553    0.085573  0.059474
400     0.092600      0.127300    0.079500  0.059700
405     0.094702      0.123903    0.073264  0.060867
410     0.099300      0.121700    0.070100  0.065100
415     0.101202      0.117586    0.066354  0.066726
420     0.100700      0.117900    0.064200  0.070500
425     0.101736      0.115284    0.062215  0.072495
430     0.103200      0.113200    0.060800  0.074000
435     0.104680      0.111941    0.059053  0.074581
440     0.105400      0.116700    0.058500  0.076300
445     0.105773      0.124562    0.058075  0.079578
450     0.105500      0.135700    0.058100  0.083800
455     0.105689      0.140357    0.057936  0.089756
460     0.108300      0.144200    0.058100  0.092400
465     0.115514      0.146002    0.058914  0.096214
470     0.120100      0.147200    0.059500  0.098100
475     0.119231      0.146764    0.059171  0.099076
480     0.116500      0.146800    0.058200  0.100100
485     0.118357      0.150648    0.058501  0.103234
490     0.122900      0.158300    0.059200  0.110700
495     0.132147      0.171753    0.060497  0.123043
500     0.146700      0.192900    0.062500  0.141600
505     0.167334      0.221224    0.064859  0.167188
510     0.192900      0.253700    0.066300  0.197700
515     0.223537      0.287602    0.068582  0.232610
520     0.257700      0.316200    0.070600  0.267600
525     0.289771      0.337305    0.073698  0.297561
530     0.320100      0.350100    0.078300  0.321200
535     0.344889      0.357517    0.086325  0.338013
540     0.367000      0.361100    0.098700  0.350400
545     0.386831      0.364681    0.120427  0.360250
550     0.403200      0.365300    0.154000  0.368500
555     0.417652      0.363841    0.203076  0.372675
560     0.433500      0.358100    0.262600  0.372800
565     0.452520      0.350641    0.325399  0.372553
570     0.473500      0.341000    0.382000  0.367600
575     0.490505      0.330396    0.429036  0.358896
580     0.506000      0.320900    0.470200  0.351200
585     0.519237      0.313389    0.503690  0.344201
590     0.531400      0.307900    0.534200  0.339500
595     0.542477      0.304229    0.560528  0.335552
600     0.551400      0.299600    0.582000  0.333000
605     0.560581      0.293765    0.599601  0.329360
610     0.568400      0.285200    0.614500  0.321800
615     0.574100      0.275822    0.626105  0.311316
620     0.579000      0.268800    0.634800  0.302500
625     0.583564      0.265061    0.640823  0.297483
630     0.590500      0.265100    0.648800  0.297000
635     0.595442      0.262884    0.654552  0.291332
640     0.597600      0.251500    0.661600  0.273800
645     0.597485      0.232761    0.667237  0.248506
650     0.595600      0.214100    0.669000  0.227800
655     0.595415      0.202363    0.671217  0.218046
660     0.591000      0.186700    0.670300  0.200500
665     0.577792      0.168095    0.668242  0.174213
670     0.564700      0.155700    0.668200  0.153800
675     0.557360      0.152540    0.673788  0.145053
680     0.562900      0.160900    0.684000  0.149800
685     0.592999      0.197868    0.695115  0.181254
690     0.635100      0.275000    0.703600  0.252800
695     0.664002      0.356104    0.707697  0.351446
700     0.679900      0.407300    0.708100  0.444300
705     0.688680      0.435447    0.708358  0.513878
710     0.693400      0.451500    0.706100  0.562800
715     0.694346      0.460376    0.703404  0.599962
720     0.693800      0.468000    0.700900  0.629400
725     0.691971      0.470848    0.696467  0.650883
730     0.685300      0.472100    0.692000  0.663800
735     0.685300      0.472100    0.692000  0.663800
740     0.685300      0.472100    0.692000  0.663800
745     0.685300      0.472100    0.692000  0.663800
750     0.685300      0.472100    0.692000  0.663800
755     0.685300      0.472100    0.692000  0.663800
760     0.685300      0.472100    0.692000  0.663800
765     0.685300      0.472100    0.692000  0.663800
770     0.685300      0.472100    0.692000  0.663800
775     0.685300      0.472100    0.692000  0.663800

TABLE G.25 

Spectra for green pepper, lemon skin, lettuce, and carrot. 

FIGURE G.24 

Spectral plots for green pepper, lemon skin, lettuce, and carrot. 



FIGURE G.25 

Spectral plots for seeds (barley, lentil, and brown rice) and sand. 

380 





580 

0.236000 

0.539800 

0.392500 

0.448700 

385 





585 

0.242429 

0.545948 

0.396736 

0.458888 

390 





590 

0.249000 

0.553400 

0.400400 

0.469100 

395 


0.181115 

0.145423 

0.089927 

595 

0.255451 

0.560970 

0.404006 

0.479434 

400 


0.180300 


mssrn 


0.260600 

0.567700 

0.406800 

0.489500 

405 

0.084282 

0.180089 

0.153924 

0.096172 

605 

0.265562 

0.574773 


0.499197 

410 

0.086500 

0.185400 

0.159100 

0.102900 

610 





415 

0.088768 

0.189811 

0.165533 

0.111717 

615 

0.276059 

0.584683 

0.414605 


420 

0.090300 

0.203800 

0.171900 

0.119400 

620 





425 

0.092494 

0.218619 

0.178945 

0.128920 

625 

0.282683 


0.417556 

0.533944 

430 

0.094600 

0.233500 

0.188100 

0.138700 

630 

0.286100 

0.600300 

0.419800 

0.542700 

435 

0.096399 

0.249696 

0.198876 

0.146069 

635 

0.288527 

0.605476 

0.421745 

0.550511 

440 

0.102800 

0.267700 

0.207600 

0.154400 

640 

0.288900 

0.612000 

0.423500 

0.558900 

445 

0.103286 

0.284197 

0.217063 

0.166592 

645 

0.288384 

0.618660 

0.425350 

0.566125 

450 

0.104500 

0.295900 

0.223700 

0.178300 

650 

0.289800 

0.625000 

0.426700 

0.572100 

455 

0.106980 

0.308261 

0.226014 

0.185299 

655 

0.292516 

0.632343 

0.428569 

0.578473 

460 

0.109300 

0.323300 

0.230800 

0.195200 



0.638600 

0.430700 

0.584900 

465 

0.112907 

0.338033 

0.234553 

0.206107 

665 

0.295585 

0.645523 

0.432834 

0.590971 

470 

0.115500 

0.352400 

0.238000 

0.215800 



0.653400 

0.434900 

0.597800 

475 

0.117150 

0.362057 

0.241322 

0.225346 

675 

0.299998 

0.659403 

0.438065 

0.604344 

480 

0.118200 

0.371300 

0.244600 

0.235600 

680 





485 

0.121176 

0.383362 

0.249995 

0.245545 

685 

0.323354 


0.442951 

0.617125 

490 

0.124700 

0.392900 

0.255600 

0.255800 






495 

0.129341 

0.404354 

0.262377 

0.265608 

695 




0.628701 

500 

0.134900 

0.416800 

0.268900 

0.275100 

n 





505 

0.142511 

0.428654 

0.276855 

0.285616 

705 

0.401009 

0.687402 

0.451710 

0.639836 

510 

0.151200 

0.440400 

0.284600 

0.296700 

710 

0.411800 

0.689600 

0.453800 

0.644200 

515 

0.161290 

0.450343 

0.293104 

0.306672 

715 

0.420478 

0.691080 

0.457019 

0.648103 

520 

0.169100 

0.458500 

0.302200 

0.316500 

720 

0.428700 

0.694800 

0.458300 

0.653300 

525 

0.176810 

0.464790 

0.310331 

0.327847 

725 

0.434835 

0.694626 

0.460946 

0.656041 

530 

0.182800 

0.470700 

0.318800 

0.338400 

730 

0.440400 

0.695400 

0.461700 

0.660000 

535 

0.188503 

0.475985 

0.327540 

0.348997 

735 

0.440400 

0.695400 

0.461700 

0.660000 

540 

0.194100 

0.482300 

0.336100 

0.359800 

740 

0.440400 

0.695400 

0.461700 

0.660000 

545 

0.199468 

0.490066 

0.344323 

0.370878 

745 

0.440400 

0.695400 

0.461700 

0.660000 

550 

0.204900 

0.497300 

0.353500 

0.381900 

750 

0.440400 

0.695400 

0.461700 

0.660000 

555 




0.393087 

755 


0.695400 

0.461700 

0.660000 

560 





760 


0.695400 

0.461700 

0.660000 

565 


0.519839 


0.415764 

765 


0.695400 

0.461700 

0.660000 

570 





770 

0.440400 

0.695400 

0.461700 

0.660000 

575 


0.532397 

0.387436 

0.438545 

775 


0.695400 

0.461700 

0.660000 


TABLE G.26 

Spectra for seeds (barley, lentil, and brown rice) and sand. 

1D box filtering 

in frequency space, 357 
in signal space, 356 

1D Nonuniform Sampling Theorem, 500-501 
1D Uniform Sampling Theorem, 340, 342 
2D aliasing, 351 
2D brakets, 166 
2D continuous box, 235-238 
illustrated, 237 
2D convolution, 167 
2D discrepancies, 458 
for circular regions, 461 
disk, 460 

for edges, 460, 461 
illustrated, 459 
2D discrete box, 239 
2D distributions, 407-408 
2D Fourier transforms, 234-239 
continuous-time, 234-238 
discrete-time, 238-239 
magnitude of, 236 
phase of, 236 
2D impulse response, 168 
2D lattice, 415 
2D linear systems, 165-166 
2D reconstruction, 352-354 
filter, 353 
2D reptiles, 253 
2D sampling theory, 347-352 


2D signal, 165-169, 407 
Fourier transform of, 125 
in image rendering, 165 
impulse, 165 

2D spherical harmonics, 757 
2D Uniform Sampling Theorem, 351, 354 
2D wavelets, 291-296 
3D 

flux in, 614 

particle transport in, 596-618 
scattering in, 619-621 
3D energy transport, 591-596 
components of, 621-630 
3D image synthesis, 408, 1076 
4D objects, 1077-1078 
4D space-time, 1078 

A 

absence of bloom, 566 
absolute integrable function, 194 
absolutely summable function, 227 
absorption, 592-593, 623 
curve, 57 
explicit flux, 594 
flux, 623 
illustrated, 622 
acceleration methods, 1026 
accommodation, 9 
achromatic channel, 44 
active interval, 130 
adaptation, 18 
range of, 19 
rod and cone, 21 

adaptive hierarchical integration, 495 
adaptive HR, 961-964 
adaptive refinement, 376 
biased, 377 
defined, 409 

sampling density and, 487 
unbiased, 380 

See also refinement; refinement algorithm; 
refinement geometry 

adaptive sampling, 327, 371, 376-381, 391, 466 
defined, 371 
implementation of, 376 
intuitive nature of, 375 
methods, 376 
point, 466 
See also sampling 
adaptive supersampling, 243 
reptiles for, 420 
See also sampling 
adjoint equations, 861, 863 
adjoint kernel, 798 
adjoint operators, 797, 861 
aerial perspective, 40 
albedo, 592 
aliases, 193, 340, 343 
aliasing, 28, 174, 343, 1033 
2D, 351 
anti, 193 
coherent, 381 
controlling, 118 
defined, 117, 497 
effects, 118, 380 

hemicube assumption, 927, 928, 930 
high frequencies and, 380 
high-frequency foldover, 381 
noise and, 398-404 
reason for, 118 
from regular sampling, 414 
structured, 1036 
artifacts, 375 
structures, 412 

trading for noise, 369, 371-375 
unstructured, 381 
See also anti-aliasing 
alpha filters, 520 


alpha measures, 305 
alpha-trimmed mean, 521 
alternating nonlinear projections onto convex 
sets, 504 

ambient component, 726 
ambient light, 725, 727, 738 
ambient term, 913 

American National Standards Institute (ANSI), 
649, 678 

standard definitions, 1135 
Ames room, 40 
amplitude 

of electric field, 550 
surfaces of constant, 550 
wavelet, 265 

analysis equation, 200, 214, 224 

See also Fourier transform; synthesis 
equation 

analysis window, 247 
analytic form factors, 1113-1134 
See also form factors 

analytic signals. See continuous-time (CT) 
signals 

angle factor. See form factor 
angle-restrictive filters, 75 
angular-momentum quantum number, 683 
anion, 683 

anisotropic effects, 518 
anisotropic function, 764 
anisotropic shading models, 740-743 
anisotropy, 740-744 
ANLAB color system, 63 
anomalous dispersion, 568 
anti-aliasing, 193 
analytic, 333 
color, 1021 
in pixels, 332-333 
See also aliasing 
antibonding orbitals, 696 
antithetic variates, 326 
defined, 326 
estimand, 328 
estimator, 328 
variance for, 326 
aperiodic box, 206 
aperiodic sampling, 373, 381-385 
characteristics, 381 
See also sampling 
aperiodic signals, 131, 197 
with active interval, 197, 198 
Fourier series coefficients for, 199 
Fourier transform of, 204 
See also periodic signals; signals 
aperture, 1013, 1021 
approximations 
constructing, 225 
discrete Fourier series, 225 
error function for, 801 
error monitoring, 800 
finite-space, 863 
flux, 659 

hazy Mie, 760-761 
integral, 227 
jittered hexagon, 443 
Kirchoff, 742 
murky Mie, 760-761 
Neumann series, 1046 
numerical, 808-817 
Riemann sum, 312 
solid angle, 602, 956 
Stirling’s, 707 
Tchebyshev, 830-831, 869 
thin lens, 1014 
two-coefficient, 276 
area bisection, 485-490 
Argand diagram, 138-139 
atmospheric modeling, 764-769 
Cornette function for, 767 
TTHG for, 768 
atomic number, 682 
atomic structure, 682-690 
atom model, 682 
atoms, 683 
bonds, 695 
excited state of, 690 
See also specific elements 
autocorrelation, 382 

Fourier transform of, 382 
See also power spectral density (PSD) 
autocorrelation function, 402, 403 
measure of support for, 403 
autocorrelation length, 744 
automatic rules, 809 
“average observer,” 660 
avoidance singularity method, 865 
azimuth angle, 554 


B 

backward ray tracing, 1023 
Banach space, 1088 
bandlimited, 337, 340-341 
reconstruction formula, 342 
signal, 340-341, 363 
band-pass filters, 219-220 
bandwidth, 411 
high, 409, 543 
local, 376, 463 
bar chart functions, 177 
basis, 178 

barrier constraints, 1068-1069 

base pattern. See initial sampling pattern 

bases 

1D, 188, 189 

representation, in lower dimension, 186-191 
base samples, 376 
basis functions, 175-186, 814 
1D, 186, 187 
bar chart, 178 
in column vectors, 190 
complex exponential basis, 184-186 
discrete, 238 
dual basis, 182-184 
linear, 983 

orthogonal families of, 179-182 
projections of, 176-179 
points in space, 175-176 
rectangular, 837 

for rectangular wavelet decomposition, 292 
scaled, 188 

for square wavelet decomposition, 293, 295 
STFT, 251 
summed, 188 

two-parameter family of, 251 
weighted, 182 
basis representation, 176 
basis vectors, 176 

linearly independent, 179 
orthogonal, 179 
transformed, 813, 832 
BDFs, 872-873 
beams, 108 
blanked, 97 
at scan line, 97 
beam tracing, 1008 

See also ray tracing; tracing 
Becker-Max algorithm, 785 
Beckmann distribution function, 737 
Beckmann theory, 737 
benign singularities, 865 
Bernoulli trials, 378 
best candidate algorithm, 430-431 

decreasing radius algorithm vs., 435-437 
examples, 433 

Fourier transform and, 432, 433 
pseudocode, 431 
two-stage, 454 
BF1 refinement, 970 
implementation, 971 
BF refinement, 963 
bias, 302, 377, 379 
correcting for, 380 
geometry for analyzing, 378 
isolating, 377 
removing, 379 
bias adjustment, 99 

bidirectional ray-tracing algorithm, 1044 
bidirectional ray-tracing methods, 1039-1044 
bidirectional reflectance distribution function. 
See BRDF 

bidirectional scattering distribution function. 
See BSDF 

bidirectional scattering-surface 

reflectance-distribution function. See 
BSSRDF 

bidirectional transmission distribution function. 
See BTDF 

binary transfer mode, 1134 
binocular depth, 35-37 
binomial distribution, 1102 
binomial theorem, 570, 571 
blackbodies, 705-708 
blackbody emission, 873 
blackbody energy distribution, 708-715 
blackbody term, 873 
black field. See uniform black field 
black matrix CRT, 75 
blind Monte Carlo methods, 307-319 
crude, 307-308, 328 
multidimensional weighted, 315-319 
quasi, 310-312, 328 
rejection, 308-309, 328 
stratified sampling, 309-310, 328 
types of, 307 
weighted, 312-315, 328 
See also Monte Carlo methods 
blind spot 
defined, 9 


demonstrating, 11 
diagram, 12 

blind stratified sampling, 309-310 
efficiency of, 310 
estimand, 328 
estimator, 328 
problem with, 310 
variance for, 310 
See also stratified sampling 
Blinn-Phong shading model, 726-731 
defined, 728 
bloom, 1055, 1062 
absence of, 565-566 
blooming filter, 1062-1063 
blue-yellow chromatic channel, 44 
blurring, 518, 1062 
Boltzmann equation, 628, 642, 861 
bonding orbitals, 696, 698 
bond orbitals, 701 
bond order, 697 
bonds, 695 

ionic, 695-696 
molecular-orbital, 696-704 
types of, 695 
Boolean function, 415 

Bose-Einstein distribution law for photons, 708 
Bose-Einstein statistics, 705-708 
developing, 705-706 
function of, 705 
See also blackbodies 
bosons, 705, 740 
bounces, 849-850 

illustrated, 850, 852, 853 
multiple, 851 
numbers of, 857 
particle moving after, 850 
probability of, 851 
remaining, 856 

boundary conditions, 589, 630-635 
differential equations and, 630 
explicit, 633-634 
finding, 630 
free, 633 

implicit, 634-635 
mixed, 635 
periodic, 633 
reflecting, 634 
specifying, 631, 632 
types of, 632-635 
bounding vectors, 318 
bounding volume hierarchies, 1026-1027 
illustrated, 1027, 1028 
bounds 

conservative, 800 
error, 800 
ideal, 800 
probable, 800 
“bow tie,” 490 
boxes, 365 

1D filtering, 356-357 
dilation equation applied to, 256 
discrepancy calculation for, 457 
enlarging, 351 
frequency-space, 505 
half-integer-sized, 293 
half-width, 254, 255 
multiplying in time domain with, 249 
transform of, 355 
box filter, 534 

box functions, 153-154, 256, 364, 894 
box reconstruction filters, 354-358 
box signals, 153-154 
2D, 235-238 
continuous, 153 
discrete, 153 

Fourier series for, 203-204 
Fourier transform for, 204-206 
See also signals 
box spectrum, 206-208 

Fourier transform of, 207, 342 
illustrated, 207 
width, 208 
box window, 247 
braket notation, 1088-1089 
2D, 166 
arguments, 144 
properties, 145 
in signal processing, 145 
BRDF, 662, 663-667, 722, 873 
anisotropic, 757 

around volumetric scattering point, 756 

combining, 675 

composite, 675 

decomposing, 738 

as function of incident angle, 756 

geometry, 666 

in Hanrahan-Krueger multiple-layer model, 
779 

in HTSG model, 744 
with normal distribution, 755 


normalized, 666-667 
precomputed, 753-757 
properties, 666-667 
reciprocity and, 666 
smooth, 756 

with specular component, 757 
sphere, 723, 724 
splitting, 738 
Ward shading model, 751 
breadth-first refinement, 494 
Brewster’s angle, 735, 877 
Brewster’s law, 735 
brightness, 1059, 1067 
defined, 1068 
monitor control, 98, 99 
perceived, 1058 
peripheral, 1068 
pixel values, 1058 
subjective, 1058 
See also contrast; CRT display 
brils, 1058 
BSDF, 722 
BSSRDF, 664, 873 
BTDF, 722, 873 
bump mapping, 780 
redistribution and, 785 


C 

camera models, 1013-1021 
candelas, 19 
candidate list, 1026 
carbon, 687, 701 
Cartesian product, 293-294, 614 
operator, 137 
space, 137 
Cartesian sum, 137 

combining spaces by, 282 
cathode, 560 
cation, 683 

Catmull-Rom spline, 531 
Cauchy-Schwarz inequality, 400 
Cauchy sequence, 1088 
Cauchy’s formula, 570-572 
coefficients for, 572 
cells, 484 

regions in, 483, 484 
See also regions; samples 
centered variance, 399 
central-star reconstruction, 511 
notation for, 512 
centroid, 399 
certain event, 1094 
channels, 43-44 
achromatic, 44 
blue-yellow, 44 
red-green, 44 

characteristic equation, 589-590 
characteristic expression, 1042 
characteristic functions, 794 
child nodes, 959, 960, 961 
See also nodes 
chroma, 66 

chromatic aberration, 14 
chromaticity diagram, 49 
illustrated, 50 
chromophores, 770 
CIE color matching, 44-51 
chroma, 66 
experiment, 46 
functions, 1170 
hue-angle, 65 
phosphors, 102 
r, g, b curves, 46 
XYZ tristimulus curves, 786 
See also colors 
CIE standard, 1152-1161 

daylight functions, 1176-1177 
file format, 1154 
illuminants, 1173-1174 
notation, 1153 

observer matching functions, 1169 
See also IES standard 
ciliary body, 8-9 

relaxing/tensing, 34 
circle of confusion, 1018, 1019 
circularly birefringent, 557 
circularly dichroic, 557 
circular shutter, 1051 
classical radiosity, 886, 888-900 
assumptions, 893 
box functions, 894 
collocation solution, 891-892 
extensions to, 979-982 
Galerkin solution, 892-893 
Gauss-Seidel iteration and, 907-909 
higher-order radiosity, 899-900 
intuition for, 895 
solution, 893-899 


strength of, 1045 
weakness of, 1045 

See also radiosity; radiosity algorithms 
classical radiosity equation, 896 
classical ray tracing, 986, 1010 

characteristic expression for, 1042 
defined, 1010 
illustrated, 1010 
paths modeled by, 1043 
power of, 1044 
strata in, 1011-1011 
weakness of, 1044 
See also ray tracing 
clipping, abrupt, 524 
closed Newton-Cotes rules, 812 
clustering interactions, 939 
clusters of four, 85-89 
clusters of two, 89-94 
coarse-to-fine operator, 269 
coefficient of variation, 1101 
coexistence singularity method, 865, 868 
collisions 

probability of, 584 
types of, 621 
collocation, 825 

classical radiosity and, 891-892 
general formulation of, 828 
matrix elements for, 839 
points, 827, 891 
polynomial, 825-830 
quadrature rule and, 825-826 
color anti-aliasing, 1021 
color bleeding, 1040, 1045 
color cloud chamber, 1079 
color computation, 66 
color contrast, 42, 44 
colored reflectivities, 110 
color grid, 163 
colormap correction, 112 
color matching, 44-51 
experiments, 47 
r, g, b curves, 46 
x, y, z, 48 

See also CIE color matching 
color opponency, 42-44 
defined, 43 
schematic, 43 
color picking system, 68 
colors 

3D linear space of, 49 
advancing, 14 
calculation of, 106 
corner, 513 

Euclidean distance between, 65 
interpolating, 69 
with same distance metric, 65 
shading and, 786-788 
See also Macbeth ColorChecker 
color shifting, 750 
color spaces, 59-68 
HSL , 67, 68 
HSV , 67, 68 
L*a*b*, 63-66 
L*u*v*, 59-66 
reference white defined, 60 
RGB, 66, 67, 100-106 
RGB color cube, 66, 67 
shift in, 60 
uniform, 60 
XYZ, 59-60 

comb function. See impulse train 
compact support, 130, 131 
completeness, 176, 1088 
completion phenomena, 11 
complex conjugate, 139 
complex exponentials, 140-143 
as basis, 184-186 
discrete, 222 
eigenfunction, 143 
Gaussian and, 209 
LTI systems and, 164 
matrix of, 223 
properties of, 141 
shorthand, 143 
complex functions 
complex-valued, 139 
real-valued, 139 
complex-linear systems, 134 
complex numbers, 138-139 
defined, 138 
magnitude of, 139 
on Argand diagram, 138-139 
See also real numbers 
complex refractive index, 553 
complex scaling factor, 169, 215 
component vectors, 175 
composite operator, 1058 
compression, 191 


lossy, 191 

representation, 193 
wavelet, 274-276 
function of, 275 
illustrated, 274-275 
methods of, 275-276 
computer-directed machining, 1078 
computer graphics 

aliasing in, 341, 537-538 

color computation in, 66 

filters and, 537 

importance sampling in, 395 

interval analysis and, 372-373 

mathematically continuous signals in, 331 

model surfaces, 567 

singularities and, 864, 868 

warping and, 503 

wavelets in, 245 

See also image synthesis; synthetic images 
conditional probability, 1095 
conducting band, 716 
cone of confusion, 1019-1020 
cones, 9, 15 

adaptation, 21 
contrast sensitivity and, 24 
in daylight, 19-20 
of light from a point, 1018 
photopigment in, 16 
polyhedral, 1009 
response curves for, 16 
response of, 22 
types of, 16 
cone tracing, 1009 
advantages of, 1035 
illustrated, 1009 
confidence interval, 476 
confidence refinement test, 476-479 
configuration factor. See form factor 
constant function, 764 
constant index of refraction, 713-714 
See also index of refraction 
constant Q resonant filter, 287 
constants and units, 1135-1137 
constant-time filtering methods, 1035 
constrained least-squares optimization, 1067 
constraints, 1067-1069 
barrier, 1068-1069 
design, 1068 
physical, 1069 
continuous signals, 128-129 

Fourier series coefficients for, 198 
continuous-time (CT) signals, 128-129 
arguments, 129 
defined, 128-129 
Fourier transform, 231 
reconstructing, 363 
scan line image modeled by, 360 
See also discrete-time (DT) signals 
continuous-time Fourier representations, 191-192 

continuous-time Fourier transform (CTFT), 197-203 
2D, 234-238 
defined, 200 
pairs, 231, 232 

See also discrete-time Fourier transform (DTFT); Fourier transforms 
contour integration, 919-921 
geometry, 921 
contrast, 79 

for cluster of four, 91 
for cluster of two, 95 
computing, 89, 92 
CRTs, 79, 1059 
gloss, 565, 566 
lightness, 31-32 
maximum available, 1060 
monitor control, 98, 99 
refinement test, 467 
relative, 1061 
spot spacing and, 89, 92 
test patterns, 86 
for white field, 96 
See also brightness; CRT display 
contrast sensitivity, 23-28 
experiment, 24 
rods and cones and, 24 
contrast sensitivity function (CSF), 24 
for adult, 25, 27 
for different species, 45 
for infant, 25, 27 
with respect to orientation, 28 
in response to frequency adaptation, 25, 26 
for scotopic/photopic vision, 24, 26 
contribution, 855 
finding, 860 

control variates, 325-326 
defined, 325 
estimand, 328 


estimator, 328 
variance for, 326 
convergence, 194-197 
condition, 506 
criteria, 903 
increased speed of, 318 
convolution, 119, 155-165 
2D, 167 

algorithm implementation, 160 
of discrete signals, 161 
discrete-time, 164-165 
example, 158 
filtering operation as, 250 
Fourier transform of, 216 
in frequency domain, 337, 383 
frequency-space equivalent of, 219 
of Gaussian bump, 170 
importance of, 160 
as integration, 123 
intuitive interpretation, 218 
manual evaluation, 161 
multiplication and, 123 
operator, 156 

physical example, 160-161 
properties, 161-162 
results, 159 
signal, 120, 122 
for three functions, 162 
transform pair, 233 
with triangles, 515 
See also impulse response 
Cook’s filter, 524-526 

See also reconstruction filters 
Cook-Torrance shading model, 731-740 
See also shading; shading models 
cornea, 6-8 
corner refinement, 483 
Cornette function, 768 
correlation coefficient, 1102 
coupled fields, 549 
covariance, 1101 
critical angle, 575 
critical flicker frequency (CFF), 18 
cross-correlation, 382 
CRT displays, 4, 71-76, 344 
black matrix, 75 
contrast, 79, 1059 
display spot interaction, 76-97 
electron guns, 71-72 
filters, 75-76 
gamma correction, 100 
gamut mapping, 106-111 
guard band and, 75 
image creation, 73 
monochromatic, 76 
phosphors in, 72-73, 101 
radiance values perceived on, 1055 
RGB color space, 100-106 
schematic, 72 

shadow mask pitch and, 74 
subset of colors on, 49 
tension of, 97 
See also displays; monitors 
crude Monte Carlo, 307-308 
box width, 313 
convergence properties, 314 
estimand error, 328 
estimator, 308, 328 
See also Monte Carlo methods 
crystalline lens, 8 
accommodation, 9 
thickness of, 34 
crystals, 579 
cubic B-spline, 531 

cubic-tetrahedral projection method, 933-934 
cube positioning for, 933 
cumulatively compatible, 492 
cutoff frequency, 337, 342, 363 
cylindrical-scratch model, 743 

D 

dart-throwing, 429 
2D discrepancies, 458 
drawbacks to, 429-430 
illustrated, 430 
patterns, 459-462 
pixel errors, 462 

Daubechies first-order wavelets, 279-280 
daylight functions, 1176-1177 
DC component, 210 
decibels (dB), 235 
decomposition, 272 
filters, 267 
deconstructions 
rectangular, 291 
square, 291 

decreasing radius algorithm, 432-435 
best candidate algorithm vs., 435-437 
geometry of, 435 


pseudocode, 434 
definition symbol, 139 
degenerate interval, 136 
degenerate kernels, 801-804 
defined, 801 
See also kernels 
del operator, 627 
delta form factors, 925, 934 
delta functions 

Dirac, 148, 150-151, 674 
discrete, 874, 875 
2D, 166 

jitter transform and, 384 
density function, 322, 1099, 1100 
perfect, 325 
depth of field, 1020 
depth perception, 33-42 
binocular depth, 35-37 
defined, 33 

monocular depth, 37-41 
motion parallax, 41-42 
oculomotor depth, 34-35 
types of, 34 

design constraints, 1068 
destructive interference, 549 
device-directed rendering, 1069-1072 
diamond lattice, 420 
defined, 420 
over set of pixels, 422 
diamond pattern sampling, 489 
didymium glass spectrum, 76 
differential equations, 544, 590, 635 
boundary conditions and, 630 
Kubelka-Munk, 774-776 
differential radiance, 663 
differential solid angle, 651 
diffraction, 563 
defined, 545 
effects of, 563 
See also interference; light 
diffuse adjustment factor, 749 
diffuse distribution, 1045, 1046 
diffuse plus specular, 728-731 
diffuse reflection, 564 
directional, 744 
perfect, 672 
uniform, 744 
See also reflection 
diffuse reflectors, 889 
diffuse transmission, 566, 567 





digital signal processing, 117-118 
dilation equation, 253-255 
application of, 253 
applied to box, 256 
coefficients, 254, 267 
conditions from, 278 
inverse square of, 286-287 
in Fourier domain, 285 
functions and, 255 

See also wavelets; wavelet transforms 
dilogarithm, 1133 
diopter, 6 
defined, 6 

Dippe filter, 527, 528 

See also reconstruction filters 
Dirac delta function, 148, 674 
behavior, 151 
plotted, 150-151 
direct current transmission, 210 
direct illumination, 723, 1002-1007 
illustrated, 1005 
indirect illumination vs., 1006 
See also illumination 
directional stratum, 1001 
directional subdivision, 1026, 1030-1031 
illustrated, 1030 
direction-based methods, 1001 
direction-driven strata sets, 993-996 
direction hemisphere, strata on, 992 
direction of propagation, 550 
directions, 598-599, 993 
around points, 607 
front outgoing, 631 
incident, 724 
locating, 599 
set of, 996, 997 
direction sets, 606-613 
hemispherical, 608 

combinations of, 609, 611 
combining, 610 
interpretation of, 610-612 
orientations of, 612 
incoming, 608, 631 
outgoing, 608, 631 
See also directions 
direction vectors, 710 
direct parameters, 450 
direct strata, 1006 
overlap of, 1007 
Dirichlet criteria, 194 


in digital domain, 227 
discontinuity, 195-197, 975 
detecting, 517, 975 
Fourier series, 195 
meshing, 975 
radiosity and, 975 
signals bouncing into/out of, 196 
discrepancy, 456 
2D, 458 

for circular regions, 461 
disk, 460 

for edges, 460, 461 
illustrated, 459 
calculating, from boxes, 457 
circular regions and, 459 
definition, 456 
for different patterns, 458 
generalized, 456 
measurement, 456 
pixel errors and, 462 
discrete-time (DT) signals, 129-130 
convolution of, 161, 164-165 
sum, 165 
defined, 130 
sampled, 130 

See also continuous-time (CT) signals 
discrete-time Fourier representations, 222-229 
discrete-time Fourier series, 222-225 
pair, 224 

discrete-time Fourier transform (DTFT), 225-229 
2D, 238-239 
of convolution sum, 231 
example, 228-229 

See also continuous-time Fourier transform (CTFT); Fourier transforms 
displacement texture method, 781 
display-adapted operator, 1057 
display-adapted visual system, 1056 
display compensation, 1056, 1057 
display device operator, 1059 
display reconstruction filters, 120 
anomalous, 568 
See also reconstruction filters 
displays, 71-111 

adaption level, 1060 
light-emitting, 71 
light-propagating, 71 
physical limits on, 1054 
RGB color space, 100-106 
technology of, 4 
See also CRT displays; monitors 
display spot interaction, 76-97 
centered spot, 80, 81 
clusters of four, 85-89 
clusters of two, 89-94 
discussion, 97 

display measurement, 79-82 
interspot spacing, 78 
pattern description, 82-84 
profile, 76-78 
spot brightness, 80 
spot patterns, 80 
two-spot, 78-79 
uniform black field, 85 
uniform white field, 94-96 
See also phosphors; spots 
distinctness of image, 566 
illustrated, 565 

distribution functions, 306, 307, 447, 1098-1099 
Beckmann, 737 
cumulative, 1099 
joint, 1099 

See also BRDF; BSDF; BTDF 
distribution ray tracing, 492, 1011, 1021-1035 
defined, 1011 
illustrated, 1011 

for indirect illumination gathering, 1034 
overview, 1034, 1035 
path capture and, 1042 
stratification and, 1031 
See also ray tracing 
distributions, 148 
2D, 407-408 
binomial, 1102 
blackbody energy, 708-715 
defined, 1102 
diffuse, 1045, 1046 
estimand, 302 
exponential, 1102 
Fermi-Dirac, 694, 695 
Gaussian, 753 
indirect parameter, 454 
jitter, 426-427 
joint, 455 
line, 934 
normal, 1102 
normality of, 307 


parent, 302, 307 
probability, 1102-1103 
radiance, 871, 886 
rectangular, 1102 
sampling, 302 
Student’s, 305-306 
dithering, 17 

divergence theorem, 627-628 
divide and conquer singularity method, 865, 868 
dodecahedron, 600 
domain, 135, 148 
defined, 148 
See also spaces 

dot product. See inner products 
double-slit experiment, 545-549 
defined, 546 
illustrated, 547 
downsampling, 271 
driving function, 636, 793, 966 
radiosity and, 966 
dual basis, 182-184 
defined, 182 

for function analysis, 183 
for function synthesis, 183 
projection coefficients, 182 
See also basis functions 
duality, 213-214 
defined, 213 
importance of, 214 
dyadic points, 260, 281 
dyes, 770 

See also pigments 
dynamic equilibrium, 626 
dynamic Poisson-disk patterns, 440-443 
defined, 440 

hexagonal jittering, 441-443 
point-diffusion, 440-441 
dynamic stratification, 496 
illustrated, 1037 
See also stratification 

E 

edges 

clumping at, 534 
intercepts of, 513 
linear, 510 
eigenfunctions, 143 

of 2D systems, 168-169 
of LTI systems, 163-164 
eigenvalues, 143 
eigenvectors, 281 
electric fields, 549 
amplitude of, 550 
illustrated, 553 
electromagnetic, 549 
electron guns, 71-72 
arrangement of, 73 
blue, 73 

dedication of, 73 
green, 73 

magnetic fields and, 100 
red, 73 

electronic transitions, 687, 692 
electrons, 682, 683-687 
bound, 683 
free, 683 

ground state of, 690 
in parallel spins, 687 
stable, 683 
unstable, 683 

See also photons; quantum numbers 
elements 

building, by quantum rules, 686 
computed on demand, 907 
importance-gathering, 970 
importance-shooting, 970 
orthogonal, 1089 
periodic table of, 1137 
perpendicular, 1089 
relaxing, 903 
residual, 904 
See also specific elements 
ellipses 

description of, 554 
vibration, 556 
ellipsoid 

of energy modes, 711 
frequency, 710 

ellipsometric parameters, 556, 558 
emission, 623 
blackbody, 873 
BRDF, 723 
explicit flux, 594 
illustrated, 622 
luminescent, 704-705, 716 
phosphorescent, 874 
radiance, 723 
responsive, 681 
sphere, 723, 724 
spontaneous, 681 


thermal, 704 
volume, 621 
emmetropic eyes, 13 
empirical shading models, 747-753 
programmable, 752-753 
Strauss, 747-750 
Ward, 750-752 
enclosure, 888 
energy 

absorption, 873-874 
arriving, 652 
blackbody, 708-715 
definition, 654 
distribution of, 691 
Fermi level of, 693 
internuclear potential, 696 
luminous, 660, 661 
modes, 711 
moving, 652 
radiant, 651, 652 
states, 693 
energy deficit, 690 
energy transport, 581-644 
3D, 591-596 

components of, 621-630 
boundary conditions and, 630-635 
integral form, 635-643 
light transport equation and, 643-644 
model, 626-629 
particle density and, 583-584 
particle flux and, 583-584 
rod model and, 582-583 
scattering and, 584-587 
scattering-only particle distribution equations, 587-591 
from source to receiver, 657 
See also energy 
ensemble, 382 
equations 

adjoint, 861, 863 

Boltzmann, 628, 642, 861 

classical radiosity, 896 

for dielectric-dielectric interface, 735 

differential, 544, 590, 630, 635, 774-776 

dilation, 253-255, 256, 267, 278, 285-287 

Fredholm, 839 

Fresnel’s, 732-737 

gray, 877 

homogeneous, 636 
inhomogeneous, 636 
integral, 543, 635, 636, 791-869 

integro-differential, 635, 636 

light transport, 643-644 

matrix, 833, 900-906 

Maxwell’s, 549, 551-552 

occupancy, 694 

particle distribution, 587-591 

particle transport, 628, 630 

Phong shading, 726 

polynomial collocation, 829-830 

radiance, 543, 544, 791, 794, 871-882, 885 

ray, 1023 

reflectance, 667, 669, 672 
rendering, 879 
Snell’s law, 576 

source-importance identity, 863 
sphere, 1023 

synthesis, 193, 198, 200, 202, 206, 224, 226 
transport, 582, 637-643, 842 
undetermined coefficient, 811 
equilibrium, 622, 626 
dynamic, 626 
flux change and, 626 
equivalence classes, 176 
error annihilation rules, 810 
error bound, 800 
tightness, 800 

error-correcting codes, 1075 
error diffusion, 440 
error function, 801 
error vectors, 832 
illustrated, 834 
estimand, 300 
distribution, 302 
value of, 300 
variance, 302 

See also Monte Carlo methods 
estimand error, 303, 304 
antithetic variates, 328 
blind stratification, 328 
control variates, 328 
crude Monte Carlo, 328 
formula for, 304 
halving, 305 

importance sampling, 328 
informed stratification, 328 
quasi Monte Carlo, 312, 328 
rejection Monte Carlo, 328 
weighted Monte Carlo, 314, 328 


estimated value, 302 
estimator, 303 
antithetic, 328 
blind stratification, 328 
control variates, 328 
crude Monte Carlo, 308, 328 
importance sampling, 328 
informed stratification, 328 
linear, 303 

minimum-variance, 303 
linear, 303 

quasi Monte Carlo, 328 
rejection Monte Carlo, 328 
unbiased, 303 
variance of, 304 
weighted Monte Carlo, 328 
See also Monte Carlo methods 
Euler’s identity, 140 
Euler’s relation, 142 
excitant, 690 

excited-state energy levels, 687 
exitance, 652, 653 
expansion, 191 
methods, 810 
expected density, 583 
expected value, 1100 
explicit approximation, 881, 882 
function of, 887 

explicit boundary condition, 633-634 
explicit expressions, 595 
explicit flux, 593-595, 791 
exponential decays, 715 
exponential distribution, 1102 
extended form factors, 979 
external variance, 497 
extinction coefficients, 554, 734 
for aluminum, silver, 1166, 1167 
for copper, gold, 1166, 1168 
eye. See human eye 
eye pass, 1047 

F 

factorization singularity method, 865, 867 
Farrell device, 923-924 
illustrated, 924 
pointers, 923 
See also form factors 
farsighted, 14 

fast Fourier transform (FFT), 240 
features 

correspondence of, 35 
extraction of, 35 
matching, 35 
shifting of, 37 

feedback rendering, 1053, 1064-1072 
See also rendering 

feedback-rendering algorithms, 1065 
Fermat’s law, 1105, 1106, 1107, 1108 
Fermi-Dirac distribution, 694 
illustrated, 695 

Fermi-Dirac statistics, 691-694 
Fermi level, 693 
fermions, 685, 691 
distribution of, 691 
fields 

coupled, 549 
electric, 549, 550, 553 
magnetic, 549, 553 
out of phase, 554, 555 
in phase, 554, 555 
polarization of, 556 
time-harmonic, 549-550 
filter bank, 287 
illustrated, 289 
filter criteria, 517, 518 
anisotropic effects, 518 
blurring, 518 
reconstruction error, 518 
ringing, 518 

sample-frequency ripple, 518 
filter function, 395 

points conforming to, 397 
filtering, 155, 411, 1062 
as convolution, 250 
defined, 216 
example, 216 
image box, 358 
local, 517-521 
low-pass, 516 
techniques survey, 537 
filters, 119, 445 
alpha, 520 

alpha-trimmed mean, 521 
angle-restrictive, 75 
band-pass, 219-220 
blooming, 1062-1063 
box, 534 

common, 219-221 
computer graphics and, 537 


constant Q resonant, 287 
Cook’s, 524-526 
decomposition, 267 
defined, 155 
designing, 299, 358 
Dippe, 527, 528 
display reconstruction, 120 
dividing, 395 

by equal-sized regions, 447 
by equal volume, 446 
fine-to-coarse, 267 
FIR, 220, 359 
flat-field response of, 519 
Gaussian, 359 
high-pass, 219, 267, 270 
ideal, 219, 221 
IIR, 220, 359 
importance of, 216 

low-pass, 120, 123, 219, 263, 267, 363 
Max’s, 527-529, 530, 531 
Mitchell and Netravali, 529-532 
multistage, 535 
neutral-density, 76 
noise sensitivity of, 520-521 
nonuniform cubic B-spline, 524 
normalization of, 519-520 
over domain tiled with rectangles, 514 
Pavicic, 525 
piecewise cubic, 535 
reconstruction, 120, 122-123, 342, 353, 354, 363, 394, 526-527 
rings, 221 
scaled, 396 
selective, 76 
separable, 514 
shapes of, 358-359 
space-invariant, 517 
space-variant, 517 
system, 155 

weighted-average filter, 532 
windowed, 220 
See also filter criteria; filtering 
fine-to-coarse filters, 267 
finite-dimensional space, 818 
finite discontinuity, 194 
illustrated, 195 

finite-elements approach, 516-517 
finite impulse response (FIR) filters, 220, 359 
with good frequency selectivity, 359 
finite-support signal, 158-159 
illustrated, 159 
finite-to-finite transfer, 918 
first-order matching, 278 
fission, 585 
fixation point, 34 
fixed points, 281 
flat-field response, 519 

noise spectrum (FFRNS), 456 
flicker, 18 
rate, 18 
reduction, 98 
sensitivity, 19 
See also CRT displays 
flip() function, 415 
flocculate, 111 
fluorescence, 715, 872 
defined, 874 
efficiency, 874 
light spectrum, 788 
modeling, 874 
scattering term, 875 
See also phosphorescence 
fluorine, 687 
flux, 591, 614-618 
in 3D, 614 
absorbed, 623 
approximation, 659 
arriving, 652 
components of, 588 
cosine term in, 616 
defined, 583, 652 
derivation, 618 
equilibrium and, 626 
explicit, 593-595, 791 
falling on patch, 658 
falling on surface, 617 
implicit, 595-596, 791 
incident, 631, 662 
input, 620 

inscattered, 625, 643 
leaving, 652 
left-moving, 587-588 
linear properties, 617 
magnitude, 615 
outscattered, 624 
particle flow and, 618 
percentage of, 623 
radiance vs., 872 
radiant, 652 


radiant, density, 708 
ratio of, 652, 653 
reflected, 662, 668 
right-moving, 587 
solution, 629 
from streaming, 623 
surface, 592 
time derivative, 596 
traveling through space, 637 
as vector quantity, 616 
volumetric emission, 623 
See also rod model 
flux-radiance relations, 655 
focal points, 1013 
primary, 1013 
secondary, 1013 
See also lenses 

folded radical-inverse function, 311 
forcing function, 636 
form factor integrals, 919 
form factor matrix, 897, 907 
illustrated, 908 
form factors, 894, 916-937 
analytic, 1113-1134 
analytic methods and, 916-919 
calculation methods, 937, 938 
computation of, 916 
contour integration and, 919-921 
defined, 916 
delta, 925, 934 
discussion, 937 

Eckert setup for measuring, 923 
energy ratio, 917 

estimating between two patches, 936 
estimation methods, 983 
extended, 979 
finite-to-differential, 934 
Galerkin method and, 937 
geometry, 917 
image synthesis and, 916 
library of, 925, 929 
matrix of, 944, 946, 966 
measuring, 921-924 
multipoint, 979 
patches, 942 

physical devices and, 921-924 
polygon, 920-921 
projection, 925-937 
hemicubes, 925-929 
line densities, 936-937 
line distributions, 934 
ray tracing, 934-936 
surfaces, 929-934 
ray-traced, 936 
reciprocity of, 895, 896 
reciprocity rules and, 919 
in specular environments, 984 
surface-to-surface, 981 
surface-to-volume, 981 
volume-to-volume, 981 
See also radiosity 
formulas. See equations 
fourth dimension, 1077-1078 
forward Fourier transform, 200 
forward ray tracing, 1023 
forward scattering. See inscatter 
Fourier basis functions, 243 
Fourier domain 

dilation equation in, 285 
wavelets in, 285-291 
Fourier pairs, 200, 214 
CT, 231, 232 

discrete-time Fourier series, 224 
of Gaussians with cutoff points, 360 
reference of, 231 
See also Fourier transforms 
Fourier series, 192-197 
for box signal, 203-204 
discontinuity and, 195 
discrete approximation, 225 
discrete-time, 222-225 
examples, 203-213 
of periodic signal, 204 
synthesis equation for, 198, 202 
time-shifting property, 231 
Fourier series coefficients, 193 
for aperiodic signal, 199 
for continuous function, 198 
impulse train, 211-212 
for periodic signal, 197 
Fourier series expansion, 192-193 
defined, 192 

relations definition for, 193 
Fourier transforms, 148, 173-240 
2D, 234-239 
amplitude of, 210 
analysis equation, 200, 210, 224 
of aperiodic box, 206 
of aperiodic function, 204 


of autocorrelation, 382 
basis functions, 175-186 
best candidate algorithm and, 432, 433 
for box signals, 204-206 
of box spectrum, 207, 342 
of continuous signal, 195 
continuous-time, 197-203 
continuous-time Fourier representations and, 191-192 

of convolution, 216 
defined, 174 
definition of, 200 
differentiation property of, 399 
discrete properties, 233 
of discrete signal, 349 
discrete-time, 225-229, 231 
duality, 213-214 
essence of, 191 
examples, 203-213 
fast (FFT), 240 
forward, 200 

frequency content and, 244 
function of, 174 
of Gaussian, 208-209 
high-order, 239 
history of, 191 
of impulse signal, 210-211 
impulse train, 212 
inverse, 200, 201, 202, 352 
magnitude of, 427, 430 
N-dimensional, 239 
Parseval’s theorem and, 203 
of periodic signals, 201-202 
properties of, 217 

representation of bases in lower dimension, 186-191 

short-term, 244, 246-252 
of signals, 123 
1D, 124 
2D, 125 

spectrum of, 201 
summary of, 230 

synthesis equation, 193, 198, 200, 202, 206, 224, 226 
table, 221 

of wavelet functions, 262 
of wavelets, 255 
See also Fourier pairs 





four-level refinement test, 468-470 
Argence test, 470 
criteria, 469 
for small objects, 469 
“small object test,” 470 
fovea, 9 

cone density in, 10 
fixation point on, 34 
rod-free zone, 11 
frames, reference, 175 
Fredholm equation, 839 

Fredholm integral of the second kind, 794, 816, 837 

free boundary condition, 633 
free interval, 136 
free term, 793 
F refinement, 963 
frequency, 142, 252 
cutoff, 337, 342, 363 
ellipsoid for, 710 
functions, 1100 

global estimation of content, 409 
of light beam, 549 
natural, 569 
Nyquist, 340, 361, 362 
resampling, 429 
resolution, 289 
sampling, 339, 344, 361 
of wavelength, 1021 
frequency domain, 286 
convolution in, 337, 383 
multiplication in, 341 
frequency response, 164, 171, 214 
of 2D systems, 168-169 
of band-pass filter, 220 
of high-pass filter, 219 
of ideal low-pass filter, 219 
of LTI system, 215 
frequency space, 174, 364 
1D box filtering in, 357 
advantage of, 174 
box, 505 

equivalent of convolution, 219 
operator, 505, 506 
properties, 516 
sinc function in, 358 
Fresnel coefficients, 735 
Fresnel reflection, 734 

for air-glass boundary, 735 
for unpolarized light, 736 


Fresnel’s formulas, 732-737 
deriving, 732 
for unpolarized light, 737 
Fresnel’s laws, 877 
Fubini theorem, 804-805 
illustrated, 804 
symbolic statement of, 805 
using, 807 

Full Radiance Equation (FRE), 875-876 
defined, 876 
operator notation, 876 
functional techniques, 499 
functions, 820 

2D projection, 188, 189 
absolute integrable, 194 
absolutely summable, 227 
analysis of, 183 
anisotropic, 764 
autocorrelation, 402, 403 
bar chart, 177-178 

basis, 175-186, 188, 190, 238, 251, 814 
Boolean, 415 

box, 153-154, 256, 364, 894 
BSSRDF, 664 
centroid of, 399 
characteristic, 794 
CIE color-matching, 1170 
complex, 139 
constant, 764 
Cornette, 767 

cumulative distribution, 1099 
daylight, 1176-1177 
delta, 166, 384, 874, 875 
density, 322, 325, 1099, 1100 
dilating, 253 

dilation equation and, 255 

Dirac delta, 148, 150, 151 

distribution, 306, 307, 447, 737, 1098-1099 

domain, 135 

driving, 636, 792, 966 

energy in, 194 

error, 801 

filter, 395, 397 

folded radical-inverse, 311 

forcing, 636 

frequency, 1100 

gain, 639 

Gamma, 306 

Gaussian, 208-209 

Green’s, 852, 853, 860 
hat, 258, 259 

Henyey-Greenstein, 761-762, 767-768 
Hermite, 139 
illumination, 864 
image, 967 

importance, 321, 322, 841, 848-864, 
964-966 

infinite discontinuity, 194-195 
input, 636 

integral equation solutions and, 817 

integrating, 606 

integrating factor, 640 

joint distribution, 1099 

Lambert, 764 

mapping, 500 

masking, 890 

Mie, 760-761, 766 

moments of, 260-261, 399 

nearest-surface, 641 

nonflat, 445 

Nystrom, 815 

observer matching, 1169 

oracle, 956, 961 

orthogonal, 179-182 

phase, 758-764, 765 

phase space density, 614 

phosphorescence efficiency, 873 

potential, 856 

potential value, 856 

projection of, 176-179 

radical-inverse, 311 

ramp, 149, 150 

Rayleigh, 760 

ray-tracing, 641 

real, 139 

refinement, 377 

reflectance, 662, 724 

reflection, 334-336, 670, 730 

residual, 801 

sampling, 361 

scaling, 244, 253, 279, 282, 284, 1062 

scattering, 758-764 

Schlick, 762-763 

separable, 234 

shifting, 253 

sinc, 155, 204, 341, 355, 358, 365, 384, 387 
spectral efficiency, 1170 
square, 253 

square integrable, 194, 1090 


surface emission, 634 
surface-scattering, 634 
surface-scattering distribution, 634 
synthesis of, 183 
unit step, 149 
value, 856 

visibility, 993, 1000, 1001 
visibility-test, 890 
visible-surface, 641, 642 
volume inscattering probability, 625 
volume outscattering probability, 624 
Walsh, 170 
warped, 500, 502 
wavelet, 260-263 
weight, 809, 1089 
window, 246-247 
See also signals 
function spaces, 1090-1091 
function vector, 810 
fuzzy logic, 1103 

G 

Gabor transform, 247 
gain function, 639 
Galerkin bases, 983 
Galerkin matrix elements, 893 
Galerkin method, 833-836, 891 
3D, 900 

classical radiosity and, 892-893 
form factors and, 937 
iterated, 836 

projection operators in, 835-836 
Galerkin radiosity, 900 
gamma correction, 100 

coordinate shift due to, 101 
Gamma function, 306 
gamut, 1069 
defined, 106 
monitor, 106 
printer, 111 

gamut mapping, 106-111, 1069 
defined, 106 
difficulty of, 106 
global, 107-110 
local, 107, 110 
methods, 106-107 
rendering information and, 111 
gathering radiosity, 953 
Gaussian bump, 114, 345, 359, 365 
convolution of, 171 
Fourier pair of, 360 
illustrated, 209 
limit of, 195 
shifted downward, 524 
Gaussian distribution, 753 
Gaussian filter, 359 
Gaussian function, 208-209 
area under, 208 

complex exponentials and, 209 
defined, 208 

Fourier transform of, 208-209 
standard deviation, 208 
variance, 208 
Gaussian window, 249 
Gauss-Seidel + Jacobi iteration, 914, 915 
Gauss-Seidel iteration, 901 

classical radiosity and, 907-909 
defined, 903 
element updating, 904 
gathering step, 910 
illustrated, 904 
performance, 915 
Gauss’s theorem, 627 
generalized discrepancy, 456 
generating wavelet. See mother wavelet 
generations, 422 

number of samples in, 422 
geometrical optics, 562 
advantage of, 563 
geometric models, 784 
geometric series, 1103 
geometry, 76, 82-84, 412 
for analyzing bias, 378 
BRDF, 666 

contour integration, 921 
form factors, 917 
of full moon, 729 
hierarchy of, 782 
HTSG shading model, 745 
for imaging by thin lens, 1016 
level of detail problem, 781 
new sampling, 411 
node, 963 
nonuniform, 490 
OVTIGRE, 880 
patch, 654 
of plane waves, 551 
“predictable,” 480 


refinement, 480-481 
refinement test, 411 
reflection, 663 
sample selection, 485, 488 
sampling, 412 
shading, 727, 1007 
of specular reflection, 573 
of specular transmission, 577 
Strauss shading model, 748 
of test pattern, 414 
TIGRE, 878 
VTIGRE, 879 

Ward shading model, 750-751 
of zones, 601 
See also patterns 
geometry term, 732 
Gibbs phenomenon, 197 
global cube, 980 
global gamut mapping, 107-110 
drawbacks, 108 
methods, 107-108 
See also gamut mapping 
global illumination algorithms, 885 
global illumination models, 725 
gloss, 564-565 

absence of bloom, 566 
contrast, 566 
distinctness of image, 566 
illustrated, 565 
of paint, 730 
sheen, 566 
specular, 566 
types of, 564 
See also surfaces 
goniometer, 740 

set up to measure isotropy, 742 
goniometric configurations, 1140 
Gouraud interpolation, 976 
Gouraud shading, 59, 66, 976 
grains, 740-741 
size of, 173 

Gram-Schmidt orthogonalization, 183, 184 
graphic equalizers, 216 
grating, 24 
graybody, 715 
gray equation, 877 
Green’s function, 852, 860 
using, 853 
groups, 684 
group velocity, 568 
guard band, 75 
guillotine shutters, 1051 

H 

Haar basis, 245, 260 
See also wavelets 

Haar wavelets, 252, 257, 258-262 
building up, 257 
coefficients for, 280 
defined, 258 
matrix, 271 
mother, 260 

operator input average, 271 
space combinations and, 282, 283 
transform, 263 

zero-order matching properties, 278 
See also wavelets 
halo, 1062 

Halton sequence, 311 
Hammersley sequence, 311 
Hanrahan-Krueger multiple-layer model, 778-780 
BRDF in, 779 
hat functions, 258 
illustrated, 259 

hazy Mie approximation, 760-761 
head-motion parallax, 41-42 
heat transfer, 982-983 
Heckbert’s algorithm, 1044 
Heisenberg uncertainty principle, 398-399 
deriving, 399 
testing, 402 

for time and energy, 402 
unitless form, 401 
helium, 685-686 
diatomic, 700 

Helmholtz reciprocity rule, 666, 667 
hemicube algorithm, 926, 929 
hemicube method, 925-929, 983 
assumptions, 927-929, 930 
aliasing violation, 930 
proximity violation, 928 
violations of, 927 
visibility violation, 929 
benefit of, 925-926, 927 
distribution pattern, 928 
over differential path, 926 
hemispheres 

precomputed set of, 754 
as projection algorithms, 831 


sampled, 753-756 
subdivision, 932 
hemispherical direction sets, 608 
combinations of, 609, 611 
combining, 610 
interpretations of, 610-612 
orientations of, 612 
surface points with, 631 
See also direction sets 

Henyey-Greenstein phase function, 761-762 
defined, 761 

experiential data for, 767 
plotted, 762 

Schlick function vs., 763 
two-term (TTHG), 762, 768 
Hermite function, 139 
Hermite interpolation, 976 
Hero’s principle, 1106 
hexagonal jittering, 441-443 
example of, 444 
pseudocode, 443 
See also jittered patterns 
hexagonal lattice, 417-420 
code for, 417 
defined, 417 
density of, 418-420 
drawbacks, 420 
illustrated, 419, 423 
isotropic nature of, 420 
jittered, 426, 428, 444 
qualities, 418 
subdivided, 420-424 
hexagonal sampling, 419 
hidden-surface removal techniques, 37 
hierarchical integration, 495 
adaptive, 495 

hierarchical radiosity (HR), 900, 937-974 
children, 944 
importance HR, 964-974 
internal nodes, 944 
leaves, 944 

links, 948, 950, 951, 953-955 
link structure for, 954 
matrix elements needed by, 949 
node structure for, 953 
one step, 954 
pseudocode, 954, 955 
root, 944 
simple (SHE), 954 
summary overview of, 961, 962 
See also nodes; radiosity 
hierarchical radiosity (HR) algorithm, 939, 951, 
974 

importance-driven, 966 
multiresolution analysis and, 964 
physical intuition of, 939 
wavelet bases to, 964 
See also hierarchical radiosity (HR); 
radiosity algorithms 
hierarchical refinement algorithm, 942 
hierarchy of detail, 781 
illustrated, 782 
hierarchy of scale, 781-785 
microscale, 785 
milliscale, 785 
object scale, 785 
high-frequency foldover, 381 
high-frequency noise, 373, 381 
high-order radiosity, 899-900 
problem with, 900 
See also radiosity 
high-pass filters, 219 
coefficients, 270 
operators as, 267 
See also filters 

Hilbert space, 819, 1089, 794 
homogeneous signals, 479 
horizontal retrace, 97 
horizontal sweep, 97 
HSL (hue, saturation, lightness), 68 
illustrated, 67 

HSV (hue, saturation, value) hexcone, 68 
illustrated, 67 

HTSG shading model, 744-747 
BRDF in, 744 
computing, 747 
geometry, 745 
reflection types and, 744 
See also shading; shading models 
hue-angle, 65 
human data, 1169-1172 
human eye, 5-14 
ciliary body, 8 

cones, 9, 15, 16, 19-20, 21, 22, 24 

converging, 34 

cornea, 6-8 

crystalline lens, 8-9 

depth perception and, 33 


eccentricity, 11-13 
emmetropic, 13 
Gullstrand’s schematic, 7, 8 
hyperopic, 14 
iris, 8 

myopic, 13 
nodal point, 6 
optical power, 11 
physiology, 6 
pupil, 8 

retina, 9, 15, 55 
rods, 9, 11, 15, 21, 22, 24 
shape variation, 11-13 
structure and optics, 6-14 
transmissive characteristics of, 22, 23 
human visual system, 3, 5-55, 174 
color opponency, 42-44 
components, 5 
depth perception, 33-42 
illusions, 51-54 

perceptual color matching, 44-51 
spectral and temporal aspects of, 14-23 
understanding of, 4 
visual phenomena, 23-33 
hybrid algorithms, 886, 1044-1049 
approximation computation, 1046 
general path of, 1048 
light transport paths and, 1045 
radiosity first pass and, 1045 
three-pass method, 1047 
variations of, 1049 
hybrid orbitals, 701 
contour map, 702 
electron-density contour map, 703 
hydrogen, 685 
atom, 688, 701 
hyperbolic trig function, 776 
hyperopic eyes, 14 
hypertexture method, 780-781 

I 

ionic bonds, 695-696 
identity matrix, 388 
identity operator, 270, 796 
IES (Illumination Engineering Society of North 
America), 649, 678 
IES standard, 1143-1152 
example file in, 1153 
keywords, 1145, 1147 
lamp-to-luminaire geometry, 1147, 1148 

main block, 147, 1143, 1145 
photometry block, 1143, 1145, 1149-1152 
tilt block, 1145-1149 
See also CIE standard 
ignorance singularity method, 865 
IIR sinc function, 359 

See also infinite impulse response (IIR) filters 
illumination 

background, 727 
calculating, 920, 1002 
direct, 723, 1002-1007 
estimation of, 1007 
function, 864 
global, 885 
models, 725 

indirect, 723, 1002-1007, 1034, 1039 

local, models, 725 

map, 1039 

painting, 1066-1067 

See also light 

Illumination Engineering Society of North 
America. See IES 
illumination sphere, 723-724 
sampling, 725 
unsampled, 726-727 
illusions. See optical illusions 
image-based processing, 1061-1063 
image box filtering, 358 
in frequency space, 357 
in signal space, 356 
See also filtering; filters 
image function, 967 
image plane, 881 
virtual, 881 

image-plane sampling, 395 
image surface, 880-881 
hypothetical, 881 
image synthesis, 4 
3D, 408, 1076 
algorithms, 792 
colored dots and, 117 
emphasis of, 1076 
form factors and, 916 
goal of, 880 
inner product and, 860 
light and material and, 543 
positive force of, 1081 
See also synthetic images 
imaging models, 1013 


immediate payoff, 858, 860 
per particle, 861 

implicit boundary condition, 634-635 
illustrated, 633 
implicit expressions, 595 
implicit flux, 595-596, 791 
implicit sampling, 881, 882 
importance-driven refinement, 974 
importance function, 321, 841 

defined on same domain as unknown 
function, 964-966 
illustrated, 322 
important domains and, 859 
Monte Carlo estimation and, 848-864 
importance-gathering element, 970 
importance hierarchical radiosity, 964-974 
attaching importance and, 974 
importance determination, 974 
importance distribution, 968, 970 
radiosity distribution, 968 
shoot importance, 971 
See also hierarchical radiosity (HR) 
importance sampling, 320-325, 392-398, 840, 
990 

by dividing filters into regions, 446 

defined, 320, 388, 392, 444, 841 
developing, 321 
disadvantages, 446 
distribution function, 447 
estimand, 328 
estimator, 328 
function of, 321, 444-445 
for generating photons, 1039 
implementing, 395, 446 
importance function and, 841 
integral equations and, 841 
Monte Carlo, 861 
multiple-scale patterns, 448 
patterns, 443-448 
propagating, 849 
rendering situations and, 849 
stratified sampling and, 395-398 
of variable-scale patterns, 449 
zero-variance estimation, 320-321 
See also importance function; sampling 
importance-shooting element, 970 
important radiosity, 968 
impulse process, 381 
impulse response, 156 
2D, 168 

analytic expression for, 218 
arbitrary system, 168 
finding, 218 
with finite support, 169 
illustrated, 157, 158 
reversed copy, 158 
using, 157 
See also convolution 
impulse signals, 148-153 
2D, 165 
continuous, 151 
defined, 148 
discrete, 151 

Fourier transform of, 210-211 
scalogram for, 290 
sifting property of, 156 
spectrogram for, 290 
unit step as, 149 
impulse train, 154, 354 
2D, 247 

Fourier series coefficients, 211-212 
Fourier transform, 212 
frequency-space of convolution with, 219 
illustrated, 211 
jittered, 384 
supersampling, 362 
incident light 

description of, 336 
evaluating at points, 334-336 
incomplete block sampling, 1033 
illustrated, 1034 
unstructured, 1033, 1034 
See also sampling 

index of refraction, 567-572, 1164-1165 
for aluminum, silver, 1166, 1167 
Cauchy’s formula, 570-572 
constant, 713-714 
for copper, gold, 1166, 1168 
as function of wavelength, 568 
at normal incidence, 1165 
Sellmeier’s formula, 569-570 
simple, 567 
Strauss model, 748 
in visible band, 571 
See also refraction 
indirect contribution 
estimating, 1031 
sampling, 1032 


indirect illumination, 723, 1002-1007 
direct illumination vs., 1006 
estimating, 1039 

gathering, through distribution ray tracing, 
1034 

illustrated, 1005 
See also illumination 
indirect parameters, 450 
distribution of, 454 
multiple, 453 
See also parameters 
indirect strata, 1006 
overlap of, 1007 
See also strata 
infimum (inf), 1087 

infinite impulse response (IIR) filters, 220, 359 
infinite shelf, 897 

environment results, 899 
infinite support, 243 
infix operator, 156, 165 
information block, CIE, 1154-1155 
information theory, 1075 
informed Monte Carlo, 319-326 
antithetic variates, 326, 328 
control variates, 325-326, 328 
defined, 319 

importance sampling, 320-325, 328 
stratified sampling, 319-320, 328 
See also Monte Carlo methods 
informed sampling, 388 
informed stratified sampling, 319-320, 329 
estimand, 328 
estimator, 328 
inhomogeneous signals, 479 
initial sampling patterns, 409-411, 415-462, 
492 

creation approaches, 415 
density of, 409 
diamond lattice, 420 
discussion of, 455-462 
frequency content and, 409 
hexagonal lattice, 417-420 
importance sampling, 443-448 
jitter distribution, 426-427 
multidimensional, 448-455 
nonuniform sampling, 424 
N-rooks sampling, 424-426 
Poisson-disk, 427 
dynamic, 440-443 
multiple-scale, 430-437 
precomputed, 427-430 
Poisson sampling, 424 
sampling tiles, 437-440 
square lattices, 420-424 
subdivided hexagonal lattices, 420-424 
subdivision, 487 
triangular lattice, 420 
uniform sampling, 415-417 
See also sampling 
inner products, 1088-1090 
of continuous functions, 1089 
defined, 1088 

norms as function of, 1089 
requirements, 1089 
inorganic molecules, 701 
input function, 636 
inscattering, 593, 625-626 
defined, 625 
explicit flux, 594 
flux, 625 
illustrated, 625 

volume, probability function, 625 
See also outscattering; scattering 
integers, 136 

integral equations, 543, 635, 791-869 
1D, 818 

characteristic functions, 794 
characteristic values of, 794 
classes of, 793 
examples, 794 
common feature of, 817 
converting into, 636 
defined, 636, 792 
degenerate kernels and, 801-804 
driving function, 793 
function of, 791 
general form of, 792 
homogeneity, 793 
importance sampling and, 841 
kind, 793 
linearity, 793 

methods for solving, 791-792 
Monte Carlo estimation, 840-864 
name, 793 
nonsingular, 794 
notation, 792 


numerical approximations and, 808-817 
operation effect on, 822 
operators, 795-798 
identity, 796 
kernel integral, 796 
norms, 798 

projection methods, 817-839 
regular, 864 
singular, 864, 869 
singularities, 793, 864-868 
solutions to, 798-801 
functions for, 817 
methods, 799 

in state transition form, 844 
symbolic methods for, 804-808 
types of, 792-795 
integral form, 635-643 

of transport equation, 637-643 
integrating factor, 640 
integration, 140 

circular regions of, 505 
double, 348 
hierarchical, 495 
kernel of, 793 
methods, 280 
Monte Carlo, 299-329 
numerical, 300 
over solid angles, 605-606 
rule, 280 

integro-differential equation, 635 

transforming to integral equation, 636 
with unknown function, 636 
intensities, 653 
of a points, 90 
of 0 points, 90 
of 7 points, 90 
computing, 89 
different, 465 
sample, 464-465 
similar, 465 
white field, 96 

intensity comparison refinement test, 465-467 
intensity difference, 466 
intensity groups, 466-467 
summary of, 474 
See also refinement tests 
intensity statistics refinement test, 473-480 
confidence test, 476-479 
sequential analysis test, 479-480 
SNR test, 474 
summary of, 480 
t test, 479 

variance test, 475-476 
See also refinement tests 
interactions 

clustering, 939 

between different types of nodes, 945 
interface, 574, 577 
interference, 549 
constructive, 549 
destructive, 549 
effects of, 563 
fringes, 549 

See also diffraction; light 
interlaced monitor, 98-99 
internal variance, 497 
internuclear potential energy, 696 
curves for hydrogen, 697 
interpolation, 497-537 
formula, 342 
Gouraud, 976 
Hermite, 976 
linear hardware, 983 
spline, 523 
interposition cue, 37 
interreflection, 979 
intersection 

photon-surface, 1039 
point, 1021 

ray-object, 1023, 1026, 1031 
of rays, 1023, 1029 
ray-sphere, 1023-1024 
interspot 

distance, 78 
spacing, 78, 80 
apparent, 97 
contrast and, 89 
See also display spot interaction 
intervals, 136-137 
analysis of, 372-373 
confidence, 476 
sifting property for, 152 
subrods in, 592 
time, 451 

invariant embedding, 644 
inverse Fourier transform, 200, 201, 202 
to central square, 352 
of spectrum, 206, 229 
ionization continuum, 687 


ions, 683 
iris, 8 

irradiance, 652 
definition, 654 
incident, 872 

irradiance-radiance relation, 655 
isolux contours, 975 
isosceles triangular lattice, 420 
isosceles triangular subdivision, 485 
isotope, 683 

isotropic materials, 629-630, 664 
absorption coefficient, 630 
illustrated, 629 

incident/scattered directions and, 629 
outscattering coefficient, 630 
See also materials 
isotropy, measuring, 742 
iterated Galerkin method, 836 
iterated kernel of order, 807 
iteration, 503-506 
methods, 505-506 
iterative methods, 901 

Gauss-Seidel iteration, 903-904 
Jacobi iteration, 903 
overrelaxation, 905-906 
Southwell iteration, 904-905 
types of, 901 

J 

Jacobi iteration, 901 
defined, 903 
illustrated, 903 
radiosity and, 907 
use of, 907 
Jacobi loop, 903 
jitter distribution, 426-427 
pseudocode, 428 

jittered hexagonal lattice, 426, 428 
illustrated, 444 
jittered impulse train, 384 
jittered patterns, 459-462 
2D discrepancies, 458 
pixel errors, 452 
jittered sampling, 384, 385 
jittering, hexagonal, 441-443 
joint distribution function, 1099 
joint distributions, 455 
Jones matrices, 558 
defined, 559 
examples of, 560 
Jones vectors, 558 
defined, 559 
joules, 651 

just noticeable difference (jnd), 24 

K 

Kajiya shading model, 741-742 
Kantorovich’s method, 839 
kernel approximation methods, 799 
kernel integral operators, 796 
kernels, 793 
adjoint, 798 

of approximate operator, 808 
arrow notation in, 843 
bounce probability and, 851 
degenerate, 801-804 
discontinuous, 837 
factorizing, 867 
iterated, of order, 807 
pole of order, 866 
product, 867 
resolvent, 806 
separable, 234, 801 
weighted, 863 
kets, 143-144 
defined, 144 
into numbers, 144 
Kirchhoff’s law, 708 
Kirchhoff approximation, 742 
Kirchhoff diffraction theory, 741 
Kubelka-Munk differential equations, 774-776 
Kubelka-Munk pigment model, 770-778 
solutions, 777-778 
theory, 772 

See also paint; pigments 

L 

L*a*b* color space, 63-66 

recovering XYZ coordinates of color from, 
64-65 

sketch of, 64 

L*u*v* color space, 59-66 

recovering XYZ coordinates of color from, 
64-65 

sketch of, 63 

XYZ color space conversion, 61 
Lagrange multipliers, 323, 707 
Lagrange polynomials, 828 
illustrated, 829 
Lambert function, 764 


Lambert shading model, 726-731 
See also shading; shading models 
lamp-to-luminaire geometry, 1147 
choices for, 1148 
lateral inhibition, 29 
lateral separation, 37 
lattice, 415 
2D, 415 
densities, 423 
diamond, 420, 421 
hexagonal, 417-424, 426, 428 
points, 418 

rectangular, 416, 417, 427 
regular, 417 
sampling, 417 
square, 420-424 
triangular, 415, 420, 421 
law of reflection, 1110 
law of refraction, 574, 1111 
LCAO-MO (linear combination of atomic 
orbitals-molecular orbital), 696, 698 
power of, 701 
LCD panels, 71 
L cones, 16, 17 

least-square projection method, 831-832 
LED displays, 71, 345 
Legendre polynomials, 675-676 
lenses 

convex-convex, 1013, 1015 
double-convex, 1013 
focal points, 1013 
shutters, 1051 
thick, 103 
thin, 1013, 1015 

See also crystalline lens; focal points 
light, 545-578 

ambient, 725, 727, 738 
arriving at Earth, 613 
atmospheric distribution of, 766 
behavior of, 545 
circularly polarized, 556 
cones of, from a point, 1018 
direct, 1004 

double-slit experiment, 545-549 
elliptically polarized, 554-556 
fluorescent spectrum of, 788 
frequency, 549 
incident 

description of, 336 
evaluating, 334-336 

from incident direction, 335-336 
index of refraction, 567-572 
indirect, 726, 1004-1006 
linearly polarized, 556 
luminescent, 704-705 
material and, 543 
mixing, 68 

particle-wave duality, 563 
paths, 1048-1049 
peripheral, 1067, 1068 
photoelectric effect, 560-562 
polarization, 549, 554-560, 735 
propagation of, 408 
propagation speed, 549 
reflection, 563-567 
sources, 1172-1177 
striking a point, 335 
striking particles, 612 
thermal, 704 

transmission, 563-567, 663 
unpolarized, 558, 737 
wavelength, 549 
wave nature of, 549-554 
lightness constancy, 32-33 
defined, 32 

lightness contrast, 31-32 
defined, 31 
example, 31-32 
illustrated, 31 
light pass, 1047 

light transport equation, 643-644 
limit errors, 107 
preventing, 111 
linear algebra, 1085-1091 
linear bisection, 481-485 
linear estimators, 303 
minimum-variance, 303 
linear index of refraction, 714 
linearity, 134, 145 
linearly birefringent, 557 
linearly dichroic, 557 
linear perspective, 40 
linear processing, 1063-1064 
linear spaces, 1085-1090 
See also spaces 
linear systems, 132 
2D, 165-166 
complex-linear, 134 
definition of, 134 


real-linear, 134 
time-invariant, 132-135 
See also systems 

line density projection method, 936-937 
line distributions, 934 
links, 948 

building set of, 954-955 
creation order, 950 
defined, 953 
for linear patches, 951 
refinement test for, 963 
See also hierarchical radiosity (HR) 
link structure, 954 
lithium, 686 
local bandwidth, 376 
establishing, 376 
estimation of, 463 
local filtering, 517-521 
criteria, 517, 518 
flat-field response, 519 
noise sensitivity, 520-521 
normalization, 519-520 
See also filtering; filters 
local gamut mapping, 107, 110 
clipped profile, 107, 108 
illustrated, 108 
methods, 107 
See also gamut mapping 
local illumination models, 725 
locally parallel, 742 
local sampling rate, 376 
log-linear scale, 478 
long-persistence phosphor, 72 
low-pass filters, 120 
choosing, 123 
ideal, 219 
illustrated, 364 
operators as, 267 
reconstruction, 363 
wavelets and, 263 
See also filters 
LTI systems 
2D, 168 

complex exponentials and, 164 
eigenfunctions of, 163-164 
frequency response of, 164, 215 
lumen, 661 
luminaire, 634, 723, 1005 
axis, 1140 
bright, 1004 
components, 1139 
defined, 1139 

goniometric configurations, 1140 
housing, 1139 
identifying, 1031 
illustrated, 889 
lamps, 1139 

illustrated, 1150 
nonrectangular, 1151 
output measurements, 1139-1140 
rectangular, 1151 
rotation, 1155, 1157 
shape codes, 1155, 1158 
spherical coordinates, 1140 

system illustration, 1141, 1142 
stratifying, 1031 
terminology, 1139-1143 
luminaire standards, 1139-1161 
CIE standard, 1152-1161 
IES standard, 1143-1152 
notation for, 1143 
luminance 

adapting to, 1060 
dynamic range of, 1054 
of everyday backgrounds, 20 
mapping, 1064 

measured in candelas per square meter, 1063 
ratio of maximum displayable, 1060 
world, 1063 
luminescence, 648 
conventional, 715 
types of, 719 

luminescent emission, 704-705, 716 
luminous efficiency curves, 21, 22 
luminous energy, 660 
transport of, 661 

M 

MacAdam ellipses, 59 

in Farnsworth’s nonlinear transformation, 62 
illustrated, 60 

in perceptually linear space, 61 
Macbeth ColorChecker, 1179-1190 
chart, 786, 1170 

chromaticity coordinates for, 1179, 1180 
colors, 1179 


Mach bands, 29-31 
defined, 29 
illustrated, 30 
neural analysis of, 31 
origin of, 29 
magnetic fields, 549 
illustrated, 553 

magnetic-moment quantum number, 683-684 
magnification, 260 
main block, CIE, 1154-1155 
main block, IES, 1143, 1145 
malignant singularities, 865 
mapping 
bump, 780 
function, 500 
gamut, 106-111, 1069 
one-to-one, 502 
texel, 781 
texture, 755, 780 
marginal density, 1100 
marginal distribution, 1100 
masking 

defined, 732 
expressing, effects, 743 
function, 890 
illustrated, 733 
mass matrix, 827 
material data, 1164-1169 
material descriptors, 778 
materials, 543, 681-718 

atomic structure of, 682-690 
blackbodies, 705-708 

blackbody energy distribution and, 708-715 
isotropic, 629-630, 664 
layers of, 778 
light interacting with, 779 
localized transitions in, 715 
molecular structure of, 694 
particle statistics of, 690-694 
phosphorescent, 723 
phosphors and, 715-718 
radiation of, 704-705 
selective absorption, 770 
selective reflection, 770 
uniform, 664 
visual appearance, 730 
matrix 

blocks, 947 

collocation elements, 839 
constant block within, 944 




finite approximate, operator, 863 
of form factors, 897, 907, 908, 944, 946, 
966 

Galerkin elements, 893 
HR elements, 949 
identity, 388 
inversion, 223 
Jones, 558-560 
mass, 827 
notation, 966 
stiffness, 827 
wavelet transform of, 837 
matrix equation, 833 
errors, 902 
general, 901 
residual, 902 
solving, 900-906 
Max’s filter, 527-529 

See also reconstruction filters 
Maxwell’s equations, 549 

for electromagnetic energy, 551-552 
list of, 552 
parameters, 552 
for plane waves, 552 
M cones, 16, 17 

mean-distance refinement test, 471-472 
distance criterion, 472 
filter criterion, 472 
illustrated, 471 
uniformity criterion, 472 
See also refinement tests 
mean squared error (MSE), 180-181 
defined in symbols, 181 
minimizing, 181-182 
mean value, 1100 

measurement block, CIE, 1155-1158 
measure theory, 639 
meshing, 888, 984 
defined, 974-975 
discontinuity, 975 
on ground plane, 1044 
problem of, 975 
techniques, 984 
metastable state, 715 
methane, 704 

method of iterated deferred correction, 839 
method of moments. See Galerkin method 
metrics, 1087-1088 
defined, 1087 


in terms of norms, 1088 
microfacets, 732 
illustrated, 733 
RMS slopes of, 737 
microscale, 785 
Mie phase function, 760-761 
approximate, 767 
defined, 760 
illustrated, 761 
Schlick function vs., 763 
Mie scattering, 760 
milliscale phenomena, 785 
minimal bounding spheres, 469 
minimax argument, 387 
minimax problem, 831 
minimum-distance constraint, 383, 429 
minimum-variance estimator, 303 
linear, 303 
mip-maps, 1035 

Mitchell and Netravali filter, 529-532 
See also reconstruction filters 
mixed reflection, 564 
mixed transmission, 566 
illustrated, 567 
modeling methods, 408 
modeling program, 1066 
modes, 710 
energy, 711 

molecular-orbital bonds, 696-704 
molecular orbitals, 699, 700 
construction of, 700-701 
molecular structures, 694-704 
molecules, 695 
inorganic, 701 
organic, 701 
moments, 260-261 
first two, 399 
method of, 833 
rth, 1101 
rth central, 1101 
vanishing, 261 
monitors, 97-100 

brightness control, 98, 99 
chromaticity diagram, 106 
colors displayed on, 100 
contrast control, 98, 99 
defined, 97 
flicker of, 98 
gamma correction, 100 
gamut, 106 
interlaced, 98 
noninterlaced, 97 
raster screen patterns, 98 
RGB, 103 

See also CRT displays; displays 
monocular depth, 37-41 
cues, 37 
defined, 37 
interposition, 37 
perspective, 39-41 
size, 38-39 
monomials, 811 

Monte Carlo estimation, 840-864 
importance function, 848-864 
path tracing, 844-848 
random walks, 842-844 
Monte Carlo importance sampling, 861 
Monte Carlo methods, 123, 299-329, 
1010-1011 

adaptive sampling, 327 
antithetic variates, 326, 328 
for attaching time to rays, 1021 
basic ideas, 300-305 
bias, 302 
blind, 307-319 

blind stratified sampling, 309-310, 328 

confidence in, 305-307 

control variates, 325-326, 328 

crude, 307-308, 328 

defined, 300 

estimand, 300 

estimand distribution, 302 

estimand error, 303, 304, 305, 312, 314, 328 

estimand variance, 302 

expected value, 302 

fundamental result of, 304 

importance sampling, 320-325, 328 

informed, 319-326 

informed stratified sampling, 319-320, 328 
integral equations and, 817 
multidimensional weighted, 315-319 
observation set, 301 
size, 301 

parent distribution, 302 
problems with, 305 
quasi, 310-312, 328 
rejection, 308-309, 328 
research, 305 
sample set, 300 


sample size, 300 
sampling distribution, 302 
sampling variance, 302 
summary of, 328 
weighted, 312-315, 328 
weighted averages, 301 
zero-variance, 320-321 
See also estimator 
Monte Carlo quadrature, 817 
moon illusion, 38-39 
defined, 38 
explanation, 38-39 
illustrated, 39 
mother wavelet, 244, 258 
motion, 412 
motion parallax, 41-42 
defined, 41 
head-motion, 41, 42 
object-motion, 41 
usefulness of, 42 
Mueller matrices, 558 
Müller-Lyer illusion, 52-53 
illustrated, 53 

multidimensional patterns, 448-455 
multidimensional reconstruction, 365 
multidimensional sampling, 365 
multidimensional subdivision method, 1031 
multidimensional weighted Monte Carlo, 
315-319 

nearest-neighbor approach, 315 
trapezoid approach, 315 
multigridding, 963, 982 
multipass algorithms. See hybrid algorithms 
multi-pass ray-tracing algorithm, 1039 
multiple-level sampling algorithm, 490-492 
cumulatively compatible templates and, 492 
defined, 490 
n-level strategy, 490 
two-level, 492 

multiple-scale Poisson-disk patterns, 430-437 
best candidate algorithm, 430-431 
building methods, 430 
decreasing radius algorithm, 432-435 
multiple-scale templates, 447 
refinement of, 497 
multiplets, 687 
multiplication 

convolution and, 123 
in frequency domain, 341 
linear systems and, 134 
matrix, 245 
in one domain, 355 
property, 217 
scalar, 1085 
in time domain, 383 
transform pair, 233 
vector, 245 

multipoint form factors, 979 
multiresolution analysis, 282-285, 297 
defined, 282 
framework for, 284 
HR algorithm and, 964 
properties of, 285 
See also resolution; wavelets 
multistage filters, 534-536 
defined, 534 
sampling rate and, 535 
spectrum, 535 
summary, 536 

See also reconstruction filters 
multistep reconstruction, 532-535 
spectrum, 535 

murky Mie approximation, 760-761 
myopic eyes, 13 

N 

natural frequency, 569 
nearest-neighbor approach, 315 
nearest-surface function, 641 
nearsighted, 13 
negative radiosity, 981 
neon, 687 
nested spaces, 284 
Neumann series, 806-808 
approximation, 1046 
defined, 806 
resolvent operator, 806 
neutral-density filter, 76 
neutrons, 682 

new sampling geometry, 411 
Newton-Cotes rules, 812, 869 
nitrogen, 687 
nodal plane, 698 
node refinement, 494 
nodes 

children of, 959, 960, 961 
data structure, 970 
defined, 953 
delayed linking of, 958 


emission field for, 953 
geometry of, 963 
for hierarchical radiosity, 953 
intermediate, 961 
linking, 955-956 
parent, 961 

root, 954, 958, 959, 961 
structures of, 953 

See also hierarchical radiosity (HR) 
noise, 28, 538 

aliasing and, 398-404 
artifacts, 374 
filter, sensitivity, 520-521 
high-frequency, 373, 381, 520 
shot, 520 
white, 402 

See also signal-to-noise ratio (SNR) 
noninterlaced monitor, 97-98 
nonlinear observer model, 1057-1061 
scenes processed by, 1061 
nonradiative transition, 718 
nonuniform cubic B-spline filter, 524 
nonuniform geometry, 490 
nonuniformity, 1067 
defined, 1068 

nonuniform reconstruction, 371, 404 
algorithms for, 404 
difficulties in, 404 
See also reconstruction 
nonuniform sampling, 369-404, 411-415 
1D Nonuniform Sampling Theorem, 
500-501 

2D Nonuniform Sampling Theorem, 503 

adaptive, 327, 371, 375, 376-381 

aperiodic, 373, 381-385 

defined, 369 

patterned, 424 

random, 411, 424 

recurrent, 522, 523 

types of, 375 

See also stochastic sampling; uniform 
sampling 

normal dispersion, 568 
normal distribution, 1102 
normalization, 519-520 
BRDF, 666-667 
norms, 1086-1087 

metrics defined in terms of, 1088 
RMS, 1086 

Tchebyshev, 430, 1087 
notation, 135-138 
2D signal, 165 
for algebraic objects, 1086 
assignment and equality, 139-140 
braket notation, 143-146 
central-star reconstruction, 512 
CIE standard, 1153 
complex exponentials, 140-143 
complex numbers, 138-139 
FRE, 876 
integers, 136 
integral equation, 792 
intervals, 136-137 
linear algebra, 1085 
luminaire standard, 1143 
matrix, 966 

operator, 265-267, 795, 876, 878, 879 
orbital, 698 

product spaces, 137-138 
radiometry, 648-649 
real numbers, 135 
solid angle, 603-605 
spaces, 146-148 
summation and integration, 140 
TIGRE, 878 
transport equation, 639 
VTIGRE, 879 
N-rooks sampling, 424-426 

d-dimensional form of, 451, 453 
pattern, 425 
pseudocode, 426 
See also sampling 
nucleons, 682 
nucleus, 682 

number-theoretic Monte Carlo. See quasi Monte 
Carlo 

numerical approximations, 808-817 
Monte Carlo quadrature, 817 
numerical integration, 809-810 
Nystrom method, 814-816 
quadrature on expanded functions, 812-814 
undetermined coefficient method, 810-812 
numerical integration, 809-810 
Nusselt analog, 921-923 
defined, 921 
Eckert setup with, 923 
function of, 922 
illustrated, 922 
See also form factors 


Nyquist frequency, 340, 362 
for sampling density, 361 
Nyquist limit, 343-344, 371 
leaking into central copy, 343 
sampling frequency at, 344 
of sampling grid, 371 
Nyquist rate, 340, 372, 403, 408 
of resampled pulses, 498 
Nystrom method, 814-817 
defined, 814 

O 

object-based refinement test, 468-472 
Cook’s test, 472 
four-level test, 468-470 
mean-distance test, 471-472 
object-count test, 470-471 
object-difference test, 468 
See also refinement tests 
object-motion parallax, 41 
objects 

4D, 1077-1078 
algebraic, 1086 
basis representations for, 176 
candidate list of, 1026 
convex, 601, 602 
halo around, 1062 
important, 974 
intersected, 998 

occupying same solid angle, 605 
radially projected, 604 
scale of, 785 

spectral information for, 1191-1206 
object scale, 785 
object tags, 1066 
observation set, 301 
size, 301 

occupancy equation, 694 
occupancy probability, 691 
occupation index, 706, 712 
ocean, 769-770 

two-layer model for, 769 
oculomotor depth, 34-35 
defined, 34 

on-demand patterns. See dynamic Poisson-disk 
patterns 

operation mapping, 176 
operator notation, 265-267, 795 
operators, 132 
adjoint, 797, 861 
building, 267-274 
coarse-to-fine, 269 
composite, 1058 
convolution, 156 
del, 627 

display-adapted, 1057 
display device, 1059 
frequency-space, 505, 506 
as high-pass filters, 267 
identity, 270, 796 
infix, 156, 165 
integral equation, 795-798 
kernel integral, 796 
as low-pass filters, 267 
norms of, 798 

projection, 144, 812, 819, 820, 835 
resolution-changing, 838 
resolvent, 806 
restriction, 267 
self-adjoint, 798 
tone reproduction, 1058 
vision, 1056 
optical illusions, 51-54 

found by Roger Penrose, 54 
“impossible figures,” 55 
Müller-Lyer, 52-53 
subjective contours, 52 
two inner circles, 53, 54 
optical path length (OPL), 1107 
optimal rules, 809-810 
optimization, 1066 

constrained least-squares, 1067 
oracle function, 956, 961 
orbitals, 684 

antibonding, 696 
bond, 701 
bonding, 696, 698 
combination of, 699 
hybrid, 701, 702, 703 
molecular, 699, 700 
notation convention for, 698 
probability density plots for, 688 
shapes, 696 

spherically symmetric, 687 
structure of, 687 

ordinary integro-differential equation, 635 


organic molecules, 701 
orthogonal constraint, 179 
normalized, 180 
orthogonal functions, 179-182 
complete, 180 
family of, 180 
outliers, 520 
out of phase, 554 
illustrated, 555 
outscattering, 593, 624-625 
explicit flux, 595 
flux, 624 
illustrated, 624 
isotropic, 630 
methods, 624 

volume, probability function, 624 
See also inscattering; scattering 
overrelaxation, 901, 905-906 
defined, 906 

solution methods and, 913 
oversampling, 340 
overshooting, 915, 916 
OVTIGRE, 880, 890 
oxygen, 687 

P 

paint, 730, 770 
handling, 770 

horizontal slice of thickness within, 771 
on surfaces, 771 
reflectance of, 774 

scattering coefficients of, 775, 777-778 
spectra, 775 
types of, 731 
See also pigments 

Painter and Sloan’s method, 512-515 
advantage of, 513 
painter’s algorithm, 37 
paint programs, 1067 
parallel axis, 557 
parallel network, 162 
illustrated, 163 
parallelogram, 821 
parameterized shading models, 752 
parameters 

arguments of, 128 
direct, 450 

ellipsometric, 556, 558 
indirect, 450, 453, 454 
phenomenological material, 552 
Stokes, 558 

Strauss shading model, 748 
parent distribution, 302, 307 
parent node, 961 
Parseval relation, 287 

for wavelet transform, 286 
Parseval’s theorem, 203 

Fourier transform differentiation property 
and, 399 

particle distribution equations, 587-591 
particle history, 844 
stopping, 847 
particle phase space, 613 
particles, 613-614 
absorbed, 584, 845 
bounce, 849-850 
collision of, 584 
counting, 585-587 
density of, 583-584, 614 
distribution of, 587, 691 
flowing over surface element, 615 
flow of, 618 
net, 622, 623 
flux of, 583-584 
interaction of, 613 
left-moving, 586 
light striking, 612 
outscattering of, 624 
path history, 844 
properties of, 582 
right-moving, 585-586 
saturation of, 591 
scattered, 584 

space-time diagram, 586 
size of, 758 
speed, 613, 617 
state visits, 852 
statistics of, 690-694 
steady state flow, 595 
streaming, 594 
suspension of, 758 
transport in 3D, 596-618 
weighted, 847 
See also particle state 
particle state, 842 
absorption, 844 
birth, 844 


bounce off of, 846 
creation, 844 
first, 844 
initial, 844 

particle description by, 844 
path history, 844 
payoff, 858 
potential, 860 
state space, 842 
transfer from, 844 
See also particles 
particle-wave duality, 563 
pass band, 220-221 
patches, 888 
big, 942 
child, 969 
circular, 952 
differential, 653 
receiving, 651 
source, 656 
diffuse, 1049 
flux falling on, 658 
form factor, 942 
geometry, 654 
hierarchy, 944 
linear, 951 
links for, 951 
parent, 948, 969 
perpendicular, 656 
projected onto source point, 658 
rays striking, 977 
reflected power from, 909 
shooting, 936, 976 
source, 653 
spherical, 649-651 
subdivided, 936 
hierarchy of, 942 
undergoing refinement, 943 
visibility of, 925 

path tracing, 844-848, 1012, 1036
advantage of, 846-847 
defined, 846, 1012 
illustrated, 1012 

patterned nonuniform sampling, 424 
patterns 

for associating time intervals with spatial 
regions, 451 

dart-throwing, 459-462 
diamond, 489 
discrepancies for, 458 
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patterns ( continued) 
discrepancy of, 311 
equidistributed, 311
initial sampling, 409-411, 415-462 
jittered, 459-462 
judging, 455-456 
multidimensional, 448-455 
multiple-scale, 448, 497 
N-rooks sampling, 425 
Poisson-disk, 427-437 
regular, 412
space-time, 452 
test, geometry, 414 
variable-scale, 449 
Yen’s study of, 522 
Zaremba, 459 
See also geometry; tiles 
Pauli exclusion principle, 685 
Pavicic filter, 523-524

See also reconstruction filters 
payoff, 855 
eventual, 861 
immediate, 858, 860, 861 
from particle in state, 858 
potential, 860 
remaining, 855, 856 
total, 855, 860 
perceived brightness, 1058 
perception 

altering, 50-51 
depth, 33-42 

perfect diffuse reflection, 672 
perfect specular reflection, 673-675 
periodic boundary condition, 633 
periodic box, 204-206 
periodic signals, 130-132, 197 
analyzing, 201 
approximate, 199 
building, 225 
defined, 130 
formation of, 131 
Fourier series coefficients for, 197 
Fourier series of, 204 
Fourier transform of, 201-202 
input, 199, 201 
with interval, 131 
See also aperiodic signals; signals 
periodic table of elements, 1137 
periodic waves, 549 
periods, 684 


peripheral lighting, 1067, 1068 
perpendicular axis, 557 
perpendicular patch, 656 
persistence, 72 
perspective 
aerial, 40 
cue, 39-41 
forced, illusion, 40 
linear, 40
projection, 39 
texture gradient, 40, 41 
phase, 252 

phase functions, 758-764 
classes of, 759 
constant, 763 
criterion for selecting, 759 
defined, 758 

Henyey-Greenstein, 761-762 
isotropic nature, 758 
Lambert, 764 
Mie, 760-761 
Rayleigh, 760 
Schlick, 762-763 
simple anisotropic, 764 
summary of, 765 
phase space, 613 

density function, 614 
illustrated, 614 
phase velocity, 551 
Phong shading, 59 
equation, 726 
model, 726-731 

phosphorescence, 112, 648, 715, 872 
defined, 873 
efficiency function, 873 
modeling, 873 
power-law, 716 
See also fluorescence 
phosphorescent emission, 874 
phosphorescent materials, 723 
phosphorescent term, 873 
phosphors, 72-73, 705, 715-718 
arrangement of, 73 
beam spread and, 74 
chromaticities of, 103 
conductors as, 716 
in CRTs, 101 
defined, 72, 715, 770 
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phosphors (continued) 
geometry, 76 

β-type points, 83
γ-type points, 83
patterns, 82-84 
triangular, 76, 77 
Hitachi monitor, 1178-1179 
intensities, 1178 
light emission, 99 
long-persistence, 72 
on-and-off, 82 
photons emitted by, 715 
radiance of, 716, 717 
reference data, 1177-1179 
shadow mask and, 73 
short-persistence, 72 
standard, 102 

coordinates for, 102 
See also display spot interaction; spots 
photoelectric effect, 560-562 
apparatus for observing, 561 
defined, 561 
photons and, 562 
photometric terms, 660 
list of, 650 

photometry, 647, 660-661 
defined, 660 

photometry block, CIE, 1159-1160 
photometry block, IES, 1143, 1145, 1149-1152
photons, 562 

absorbed, 1038-1039 

apparent mass of, 562 

Bose-Einstein distribution law for, 708 

emitted by phosphor, 715 

generating, 1039 

occupation index, 706 

points intersected, 1037 

striking surfaces, 1039 

See also electrons 

photon tracing, 988-989, 1037-1039 
defined, 988 
illustrated, 989 
machine produced by, 1038 
path generation and, 1042 
process of, 1037 

See also ray tracing; visibility tracing 
photopic spectral luminous efficiency, 660 
photopic vision, 19-20 
CSF for, 24, 26

luminous efficiency functions, 21, 22 


photopigment, 15 
photoreceptors, 9-10 
cells on top of, 17 
density change, 11 
density of, 10, 117
function of, 117 
in invertebrates, 18 
packing patterns, 55 
physical constants, 1136 
physical constraints, 1069 
physical devices, 921-924 
Farrell device, 923-924 
Nusselt analog, 921-923 
physically based rendering, 885 
physical optics, 549, 1074 

as image formation model, 563 
picture space, 1065 

piecewise-continuous reconstruction, 507-517 
Painter and Sloan’s method, 512-515 
thin-plate splines, 515-517 
for triangles, 514 
Whitted’s method, 507-509 
Wyvill and Sharp’s method, 509-512 
See also reconstruction 
piecewise cubic filter, 535
pigments, 730, 770 
defined, 770 

modeling limitation, 776-777 
See also paint 
pinhole camera, 1013 
illustrated, 1014 
pitch, 74 

pixel centers, 360 
resampling at, 483 
pixel errors, 462 
pixel grid 

sampling function and, 361 
supersampling impulse train and, 362 
pixel level, 496 
pixels, 407 

anti-aliasing in, 332-333 
average color of, 332 
clamped, 1062 
diamond lattice over, 422 
out-of-gamut, 1070 
ray tree for, 1071 
sampling, 361, 414, 476
sampling pattern through, 783 
split, 477 
target for, 1070
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Planck’s constant, 562 
Planck’s law, 713 

medium-dependent, 713 
Planck’s relation, 401 
plane waves, 550 
geometry of, 551 
Maxwell’s equations for, 552 
point-diffusion algorithm, 440-441 
example of, 442 
pseudocode, 442 
point driven strata sets, 996-999 
points 

collocation, 827 
directions around, 607 
direct light at, 1104 
intersection, 1021 
particle transport in 3D, 596-597 
quadrature, 809 
radiance leaving, 656 
surface, 993 
point samples, 119 

point-sampling approach, 333, 372, 412, 1036 
adaptive, 466 
See also sampling 
point set matrix, 387 
point sets, 386 
Poisson-disk criterion, 427 
Poisson-disk patterns, 427-437 
building by dart throwing, 429 
defined, 427 
dynamic, 440-443 

jittered hexagon approximation to, 443 
multiple-scale, 430-437 
precomputed, 427-430 
pseudocode, 429 
sampling tiles and, 437-440 
weighted, 449 
Poisson sampling, 424 
pseudocode, 425 
polarization, 554-560 

Torrance-Sparrow model, 739 
tracking of, 740 
polarized light, 558 
polygons 

clustering, 985 

form factor between, 1132-1134 
mesh beating, 931 
rendering systems, 408 
polynomial collocation, 825-830 
defined, 827 


equations, 829-830 
See also collocation 
polynomial interpolation, 828 
post-aliasing, 518 
postprocessing, 911, 1054-1064
defined, 1056-1057 
image-based processing, 1061-1063 
linear processing, 1063-1064 
methods, 1053, 1056 
nonlinear observation model, 1057-1061 
potential function, 856 
potential payoff, 860 
potential value function, 856 
Poulin-Fournier shading model, 742-743
power 

shooting the, 909 
unshot, 911 
power-law decay, 716 
power SNR, 456 

power spectral density (PSD), 382, 384 
computing, 386 
defined, 382 
flat, 402 

single copy of, 383 
precomputed BRDF, 753-757 
advantages, 753 
lining up, 754 
sampled hemispheres, 753 
spherical harmonics, 756-757 
precomputed Poisson-disk patterns, 427-430 
primary focal points, 1013 
principle of detailed balancing, 692 
principle of reciprocity of transfer volume, 655 
principle of univariance, 17 
principal quantum number, 683, 685
printers 

chromaticity diagram, 106 
gamut, 111 

probability, 1093-1103 
certain event, 1094 
conditional, 1095 
defined, 1095 
distributions, 1102-1103 
events and, 1093-1095 
experiment, 1093 
further reading, 1103 
geometric series and, 1103 
measures, 1101-1102 
random variables and, 1098-1101 
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probability ( continued ) 

repeated trials and, 1097-1098 
total, 1095-1097 
processes 

impulse, 381 
skip, 381 
weighted, 381 
processing 

image-based, 1061 
linear, 1063-1064
See also postprocessing 
product spaces, 137-138 
Cartesian, 137 

progressive radiosity, 934, 963, 1049
See also radiosity 
progressive refinement, 911-913 
defined, 911 
See also refinement 
progressive refinement algorithm, 942 
projected areas, 597-598 
defined, 597 
illustrated, 598 
radiometry and, 649 
projected solid angles, 603 
radiometry and, 649 
projection methods, 799, 817-839, 925 
cubic-tetrahedral, 933-934 
discussion, 839 
essential points about, 818 
Galerkin, 833-836 
Kantorovich, 839 
least square, 831-833 
line density, 936-937 

method of iterated deferred correction, 839 
pictures of function space, 819-825 
polynomial collocation, 825-830 
ray tracing, 934-936 
single-plane, 931, 932 
Tchebyshev approximation, 819, 830-831 
wavelets, 837-838 
projection operators, 144, 819, 820 
in Galerkin method, 835-836 
orthographic, 819 
truncation, 812 
projections, 504 

alternating nonlinear, 504 
form factors, 925-937 
length of, 832 
vector, 832 

projection surface, 925 


projection techniques, 819 
propagation 

direction of, 550 
speed of, 549 
protons, 682 

absorption of, 690 
pseudocode 

best candidate algorithm, 431 
BuildLinks, 956 
decreasing radius algorithm, 434 
GatherRad, 959 
GatherRadShootImp, 973
hexagonal jittering, 443 
HR, calling dependence, 954, 955 
InitBs, 956 
jitter distribution, 428 
N-rooks sampling, 426 
OKtoKeepImpLink, 972 
OKtoKeepLink, 965 
OKtoLinkNodes, 957 
point-diffusion algorithm, 442 
Poisson-disk patterns, 429 
Poisson sampling, 425 
PushPullImp, 973 
PushPullRad, 960 
Refine, 957 
RefineLink, 965
SolveAHR, 964 
SolveDual, 972 
SolveHR, 958 
SolveImpHR, 970
SolveSHR, 955 
pupil, 8 
pure DC, 210 
Purkinje shift, 21 
pyramid algorithm, 273 

Q

quadratic formula, 400 
quadrature 
methods, 799 
Monte Carlo, 817 
on expanded functions, 812-814 
points, 809 
weights, 809 
quadrature rules, 809 
automatic, 817 
classes of, 809-810 
collocation and, 825-826 
quanta, 401 
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quantum efficiency, 690 
quantum-mechanical distribution, 685 
quantum-mechanical electronic interaction 
factor, 691 

quantum modes, 705-706 
quantum numbers, 683 
angular-momentum, 683 
magnetic-moment, 683-684 
principal, 683, 685
spin-moment, 684 
total angular, 684
See also electrons 
quantum optics, 563 
quasi Monte Carlo, 310-312 
defined, 311-312 
estimand, 328 
estimand error for, 312 
estimator, 328 

See also Monte Carlo methods 
quasi-single-scattering model, 769 
quenching, 717 

quincunx lattice. See diamond lattice 

R 

radiance, 643, 653, 791 
argument, 977 
computing, 659 
definition, 654, 656 
differential, 663 
reflected, 668 
discussion of, 656-659 
distribution, 871, 886 
flux vs., 872 
incidence, 667 
leaving points, 656 
moon, 730 

perceived values, 1055 
of phosphors, 716, 717 
reflected, 738 
relative, 1061 
sphere, 723, 724 

radiance equation, 543, 544, 791, 871-882, 885 
absorption term, 874 
BDF and, 872-873 
blackbody term, 873 
fluorescence and, 874-875 
forming, 872-876 
FRE and, 875-876 
full, 876 

importance of, 871 


OVTIGRE, 880, 890 
phosphorescence and, 873-874 
singularities in, 794 
solutions to, 544 
solving, 880 
TIGRE, 877-878 
time-invariant, 877 
VTIGRE, 878-880 
See also radiance 
radiance exitance, 652, 653 
radiant energy, 651 
definition, 654 
density, 652 
radiant exitance, 888 
radiant flux, 652 
area density, 652 
definition, 654 
lumen, 661 
watt, 652 
See also flux 
radiation, 704-705 

of blackbodies, 708, 713 
solar, 764, 766 

radiation factor. See form factor 
radiators, 715 

radical-inverse function, 311 
folded, 311 

radiometric conversions, 648 
radiometric relations, 653-661 
radiometric terms, 649, 651-653 
list of, 650, 662 
spectral, 649-650 
radiometry, 543, 643, 647-677 
defined, 547 
definitions, 654 
examples, 672-675 
notation, 648-649 
projected areas and, 649 
projected solid angles and, 649 
reflectance and, 661-671 
spectral, 659 

spherical harmonics and, 675-677 
spherical patches and, 649-651 
radiosity, 885, 887-982 
adaptive, 961-964 

classical, 886, 888-900, 979-982, 1045 

defined, 887 

discontinuities of, 975 

distribution, 968 

driving function and, 966 
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radiosity (continued) 

due to children of a node, 960 

estimate, 913 

finding, 960 

Galerkin, 900 

gathering, 953 

heat transfer and, 982-983 

hierarchical, 900, 937-974 

higher-order, 899-900

important, 968 

meshing, 974-976 

negative, 981 

power per unit area, 959, 960 
progressive, 934, 963, 1049 
reflected, 907 
result of, 888 
shooting, 953, 955 
shooting power, 976-979 
simulations, 981 
steps, 937 

strengths/weaknesses of, 1045 
of surfaces, 888 
transfer of, 968 
undistributed, 907, 909 
unshot, 907, 911 
view-independent solutions, 1045 
See also form factors; radiosity matrices 
radiosity algorithms, 881, 888, 967 
hardware implementations of, 984 
See also hierarchical radiosity (HR) 
algorithm 
radiosity matrices 

Gauss-Seidel iteration and, 907-909 
Jacobi iteration and, 907 
overrelaxation and, 913-914 
progressive refinement and, 911-913 
solving, 906-916 
Southwell iteration and, 909-911 
See also radiosity 
ramp function, 149 
derivative of, 150 
illustrated, 150 
random-dot stereogram, 35 
illustrated, 36 
single-image (SIRD), 35-37 
random nonuniform sampling, 411, 424 
random order breadth-first refinement, 494 
random variables, 300 
average, 301 
covariance of, 1101 


cumulative distribution function, 1099 
distribution function, 1098-1099 
negatively correlated, 1102 
normal, 301 

positively correlated, 1102 
probability and, 1098-1101 
transformations from uniform, 398 
uncorrelated, 1102 
uniformly distributed, 397 
random walks, 442-444
creating, 846 
five-step, 848 
range, 148 

range compression, 107 
partial, 109 

six possibilities for, 109
rasterization, 497 
rational fraction, 773 
ray equation, 1023 
ray law, 659 

Rayleigh-Jeans Law of Radiation, 711-712 
Rayleigh phase function, 760 
Schlick function vs., 763 
Rayleigh scattering, 760 

light distribution due to, 766 
modeling, 766 

ray-object intersection, 1023, 1026, 1031
routines, 1023 
rays 

chief, 1016 

constructing, 1022-1023 
intersection of, 1029 
first, 994, 1023
missing sphere, 1025 
passing through sphere, 1025 
propagated, 1027-1029 
tangent to sphere, 1025 
time for, 1021

tree of, 990, 991, 1033, 1071
ray-sphere intersection, 103 
illustrated, 1024 
ray-traced form factors, 936 
ray-tracer, 119, 122

ray tracing, 372, 659, 885, 934-936, 987-1050 
architectures, 1050 
backward, 1023 
bidirectional, 1039-1044 
classical, 886, 1010-1011, 1042-1043, 1044
defined, 987-988 

distribution, 492, 1011, 1021-1035, 1042
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ray tracing (continued) 
forward, 1023 

hybrid algorithms, 1044-1049 
implementation of, 1050 
overview, 1033, 1035 
strengths/weaknesses of, 1044 
See also photon tracing; rays; visibility 
tracing 

ray-tracing algorithms, 881 
ray-tracing function, 641 
ray-tracing projection method, 934-936 
disadvantage of, 935 
efficiency of, 936 
ray-tracing volumes, 1049-1050 
ray-tree comparison refinement test, 472-473 
Akimoto test, 472, 473 
refinement levels for, 473 
real functions 

complex-valued, 139 
real-valued, 139 
real index of refraction, 554 
real interval, 137 
real-linear systems, 134 
real numbers, 135 

complex conjugate of, 139 
See also complex numbers 
real object spectral information, 1191-1206 
reciprocal basis. See dual basis 
reciprocity, 666 
reciprocity relation, 918 
reciprocity rules, 895 
form factors and, 919 
reconstruction, 174, 341-346, 371, 411 
1D continuous signal, 336-340
2D, 352-354 

after sampling with shah functions, 345 
bandlimited, formula, 342 
central-star, 511-512
defined, 331, 341 

evaluating incident light at point, 334-336 

in image space, 354-359 

interpolation and, 497-537 

low-pass, 363 

mechanics of, 352 

multidimensional, 365 

multistep, 532-535 

nonuniform, 371, 404 

piecewise-continuous, 507-517 

Painter and Sloan’s method, 512-515 
thin-plate splines, 515-517 


Whitted’s method, 507-509 
signal, 336 

from sum of sinc functions, 343
spectrum, 358 
star, 483 

target, density, 496 
of uniformly sampled signals, 498 
zero-order hold, 344-346 
reconstruction errors, 344, 415, 518 
defined, 497 

reconstruction filters, 120, 342 
2D, 353 
box, 354-358 
choosing, 122-123, 343 
coefficients of, 526, 527 
Cook, 524-526 
Dippe and Wold, 527, 528 
Max, 527-529 

Mitchell and Netravali, 529-532 
multiplying, 354 
multistage, 534-536 
Pavicic, 523-524, 525 
pixel-based, 394 
selecting, 537 
summary of, 536 
reconstruction points, 498 
location of, 498 
rectangular 2D basis, 837 
rectangular deconstruction, 291 
rectangular distribution, 1102 
rectangular lattice, 417 
defined, 417 
illustrated, 416 
jittered, 327 

rectangular wavelet decomposition, 291-293 
basis functions for, 292
example, 293 

recurrent nonuniform sampling, 523 
illustrated, 522 
recursive visibility, 1002 
red-green chromatic channel, 44 
reference data, 1163-1206 
human data, 1169-1172 
light sources, 1172-1177 
Macbeth ColorChecker, 1179-1190
material data, 1164-1169 
phosphors, 1177-1179 
real objects, 1191-1206 
reference frames, 175 
3D, 176 
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reference white, 60 
refiltering, 408 
refinement, 371 

adaptive, 376, 377, 379, 409, 487 

Akimoto process, 491 

BF, 963 

BFI, 970, 971 

breadth-first, 494 

corner, 483 

criteria for higher-density sampling, 380 
F, 963 

function, 377 
importance-driven, 974 
initial sampling and, 463-465 
multiple-scale template, 497 
node, 494 

optimistic approach, 463 
patches undergoing, 943 
pessimistic approach, 463 
sample intensities and, 464-465
straightforward, 376 
tree, 493 

refinement algorithm 
hierarchical, 942 
progressive, 942 
refinement criteria, 376 
implementing, 463 
refinement geometry, 480-497 
area bisection, 485-490
linear bisection, 481-485 
multiple-level sampling and, 490-492 
nonuniform, 490 
sample, 480-481

tree-based sampling and, 492-497 
refinement strategy, 376 
two-stage, 376 

refinement tests, 411, 463, 465-480
acceptance, 463 
Akimoto, 472, 473 
Argence, 470 
contrast, 467 
geometry, 411 
Hashimoto, 471 
intensity comparison, 465-467 
intensity difference, 466 
intensity groups, 466-467
summary of, 474 
intensity statistics, 473-480 
confidence test, 476-479
sequential analysis test, 479-480 


SNR test, 474 
summary of, 480 
t test, 479 

variance test, 475-476 
Jansen and van Wijk’s test, 467 
for links, 963 
object-based, 468-472 
Cook’s test, 472 
four-level test, 468-470 
mean-distance, 471-472 
object-count, 470-471 
object-difference test, 468 
ray-tree comparison, 472-473 
Roth test, 468 
samples in, 465 
types of, 465 
See also refinement 
reflectance, 661-671 
of blackbodies, 708 
defined, 661 
functions, 662, 724 
paint, 774 

for polarized light, 735 
types of, 669-670 
reflectance equation, 667, 669, 672 
double-integral form, 673 
reflectance factors, 662 
defined, 670 
types of, 671 

reflectance ρ, 662, 667-669
defined, 667 
reflected radiosity, 907 
reflected vector, 573-574 
reflecting boundary condition, 634 
reflection, 594 
anisotropic, 566 
defined, 563, 661 
diffuse, 564 

directional, 744 
uniform, 744 
energy transport and, 592 
forms of, 563-564 
Fresnel, 734, 735 
geometry, 663 
gloss, 564-565 
interreflection, 979 
isotropic, 566 
law of, 1110 
mixed, 564 

from normal incidence, 739 
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reflection ( continued) 
perfect diffuse, 672 
perfect specular, 673-675 
retro, 564 

specular, 563, 564, 1105-1110
geometry of, 573 
ideal, 744 

total internal (TIR), 574-576 
reflection functions, 334-336 
for moon, 730 
types of, 670 
reflectors, 889 
refraction, 14 

constant index of, 713-714 
defined, 574 
illustrated, 576 

index of, 567-572, 713, 1164-1165 
law of, 574, 1111
linear index of, 714 
relative index of, 734 
simple index of, 734 
region of support, 130 
regions, 483-484 
Voronoi, 508 
See also cells; samples 
rejection Monte Carlo, 308-309 
defined, 308 
error due to, 309 
estimand, 328 
estimator, 328 

See also Monte Carlo methods 
relaxation algorithm, Southwell-type, 914 
relaxation methods, 901 

overrelaxation, 901, 905-906 
residual and, 902 
underrelaxation, 906 
remaining payoff, 855 
illustrated, 856 

removal singularity method, 865, 866-867 
rendering, 1053-1072 
defined, 885 

device-directed rendering, 1069-1072 
feedback, 1064-1072 
house painter example of, 544 
importance sampling and, 849 
physically based, 885 
subjective, 1076-1077 
systems, 408 
volume, 1074-1075 


rendering algorithms, 885 
development of, 1076 
rendering equation, 879 
rendering methods, 408, 544 
solid angles and, 970-971 
See also radiosity 

RenderMan shading language, 752, 789 
repeated rules, 812 
repeated trials, 1097-1098 
reptiles, 253 

adaptive supersampling and, 420 
defined, 253, 420 
illustrated, 254 
resampling, 429 
frequency, 429 
grid, 418 

locations, 408, 464
at pixel center, 483
points, 498 
See also sampling 
residual, 902 
element, 904 
residual function, 801 
residual minimization, 800-801 
defined, 800 
resolution 

frequency, 289 
limit, 173 
signal, 244 
spatial, 173 
of strata, 997 
visible, 999, 1002
wavelets and, 252 

multiple resolutions, 282-285 
resolution-changing operator, 838
resolution of identity, 286 
resolved strata, 998 
applying, 999-1002 
resolvent kernel, 806 
resolvent operator, 806
responsive emissions, 681 
restriction operator, 267 
retina, 9 

packing patterns on, 55 
photosensitive cells, 15 
retinal disparity, 35, 37 
retinal ganglion cells, 29 
retro-reflection, 564 
RGB coefficients, 66 
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RGB color cube, 66 
illustrated, 67 

RGB color space, 66, 100-106, 786
interpolation of two colors in, 67 
XYZ conversion to, 104 
RGB monitors, 103 
rhodopsin, 15-16 
Riemann sum approximation, 312 
right-hand rule, 1140 
ringing, 518 

Ritz-Galerkin method. See Galerkin method 
rod model, 582-583 
illustrated, 583 
particle properties, 582 
scattering rule, 585 
See also flux 
rod ring, 11 
rods, 9 

adaptation, 21 
contrast sensitivity and, 24 
hyper-polarized, 19 
outside fovea, 11 
photopigment in, 15-16 
response of, 22 
See also rod model 
rogues, 520 

eliminating, 520 
root-mean-square (RMS) 
error, 519 

for microfacets, 737 
norm, 186 
roughness, 744 
SNR, 474 

root nodes, 954, 958, 959 
hierarchy for, 961
roughness, 737-738 
RMS, 744 
surface, 738 

Russian roulette, 847, 1032 

S 

sampled signals. See discrete-time (DT) signals 
sample-frequency ripple, 518 
samples, 119, 174 
aperiodic, 373 
base, 376 

clumping at edge, 534 
concentric rings, 413 
direct parameters of, 450 
distributing, 451 


distribution of, 445 
indirect parameters of, 450 
intensities of, 464-465 
light, 334-336 

modeled by loose/stiff springs, 515 
per pixel, 414 
“pilot” set of, 379 
point, 119, 334 
preciousness of, 122, 408 
refinement geometry, 480-481 
in refinement tests, 465 
with same value, 471 
rogue, 520 
See also cells; regions 
sample selection geometry, 485 
recursion and, 488 
sample set, 300 
sample size, 300 
sampling, 119 
in 2D, 347-352 

adaptive, 327, 371, 375, 376-381, 466 

aliasing caused by, 414 

anti-aliasing in pixel, 332-334 

aperiodic, 373, 381-385 

blind stratified, 309-310 

complete block, 1034 

continuous signal, 331 

credo, 122, 408

diamond pattern, 489 

distribution, 302 

downsampling, 271 

error, 412

geometry, 412 

hexagonal, 419 

image-plane, 395 

implicit, 881, 882 

importance, 320-325, 392-398, 840, 841, 
990, 1039

incomplete block, 1033 

with incomplete block designs, 450 

informed, 388 

informed stratified, 319-320, 329 
jittered, 384, 385 
multidimensional, 365 
multiple-level, 490-492 
new, geometry, 411 
nonuniform, 369-404, 411-415, 424
N-rooks, 424-426, 451, 453
oversampling, 340
pixel, 361, 414, 476
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sampling ( continued) 

point, 333, 372, 412, 1036
Poisson, 424, 425 
rate, 343 

instantaneous, 501 
local, 376 
low, 415 

multistage filter and, 535 
recurrent nonuniform, 522, 523 
sequential uniform, 492-495 
signal, 120, 336, 361 
space-time pattern, 452 
square pattern, 489-490 
stochastic, 373, 375, 411 
stratified, 310, 388-392, 395-398, 479 
supersampling, 359-365, 416 
system, 338 
tree-based, 492-497 
undersampling, 340, 398, 1033 
uniform, 332, 336-340, 411-415, 415-417 
upsampling, 271 
variance, 302 
of viewing plane, 987 
See also initial sampling patterns 
sampling density, 361 
high, 370, 379, 380 
proportional to intensity, 370 
uniform, 532 

uniform sampling and, 370 
variable, 369-371 
variant, 532 

sampling frequency, 339, 361 
at Nyquist limit, 344 
sampling lattice, 417 
sampling patterns 

comparison of, 386-388 
initial, 409-411 

See also initial sampling patterns 
sampling theorem, 336 

for uniformly spaced samples, 341 
See also uniform sampling 
sampling tiles, 437-440 
2D, 437 

continuous transformation to, 439-440 
nonuniform, 437, 438 
square, 439, 440 
saturation, 874 
scalar multiplication, 1085 
scalars, 440 

elements in vectors, 842 


scale, wavelet, 252 
scaled filter, 396 
scaled impulses, 202 
scaling coefficients, 253 
scaling factor, 1061-1062, 1063, 1064 
scaling function, 244, 253, 255, 282, 284 
regularity of, 279 
slowly changing, 1062 
wavelet construction from, 255 
scalogram, 287, 289 
scan conversion, 122, 926-927, 983 
scan lines, 97 
scanning algorithm, 453 
scattering, 584-587 
in 3D, 619-621 
approaches to, 593 
Mie, 760 
particles, 584 

space-time diagram, 586 
probability, 584 
quasi-single, 769 
Rayleigh, 760, 766 
rod model, 585 
rule results, 585 
volume, 619 

See also inscattering; outscattering 
scattering coefficients, 775, 777-778 
scattering functions, 758-764 
scene parameters, 1065 
Schauder basis, 258 
Schlick phase function, 762-763 
comparisons, 763 
defined, 762 
illustrated, 763 
values in, 763 
S cones, 16, 17 
score, 855 
scotophot, 770 
scotopic vision, 19 
CSF for, 24, 26

luminous efficiency functions, 21, 22 
secondary focal points, 1013 
selective filter, 76 
Sellmeier’s formula, 569-570 
simplifying, 570 
square root, 570 

as summation of resonance terms, 569 
using, 569-570 

semantic inconsistency errors, 108 
preventing, 111 
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semi-major axis length, 554 
semi-minor axis length, 554 
senkrecht, 734 
sequential analysis, 479 
refinement test, 479-480 
sequential probability ratio test (SPRT), 479 
sequential uniform sampling, 492-495 
series network, 162 
set of measure zero, 639 
shaders, 752-753 
shading, 381, 412, 543, 721-788 
accuracy and, 722 
anisotropy and, 740-743 
Blinn-Phong, 728 
color and, 786-788 
defined, 721 

directional functions and, 723 
geometry, 727, 1007
Gouraud, 59, 66, 976 
hierarchies of scale, 781-785 
language, 752 
Phong, 726 

precomputed BRDF and, 753-757 
programmable, 752-753 
surface, 757 
texture and, 780-781 
volume, 757-780 
shading exitance solid angle, 721 
shading models, 549, 721 
anisotropic, 740-743 
approximate, 722 
assumptions of, 722 
Blinn-Phong, 726-731 
Cook-Torrance, 731-740 
criteria for, 721-722 
empirical, 722, 747-753 
HTSG, 744-747 
Kajiya, 741-742 
Lambert, 726-731 
multiple-layer, 778 
parameterized, 752 
Phong, 726-731 
physically based, 722 
Poulin-Fournier, 742-744
programmable, 752-753 
Strauss, 747-753 
Ward, 750-752 
See also shading 
shading point, 738 

lining up BRDF with, 754 


shadow boundaries, 848 
shadowing 
defined, 732 
expressing, effects, 743 
illustrated, 733 
shadow mask, 73 
dot spacing on, 74 
pitch, 74 

shadow-mask technique, 900 
shadows, accuracy in, 848 
shah function. See impulse train 
sheen, 566 
shells, 684 

shifting functions, 253 
shift-invariant systems, 135, 164 
shoot importance, 971 
shooting patches, 936, 976 
shooting power, 909, 976-979 
directly from patches, 979 
shooting radiosity, 953, 963 
initializing, 955 
short-persistence phosphor, 72
short-term Fourier transform (STFT), 244, 
246-252 

basis functions, 251 
defined, 246 
dot spacing, 287 
lattice, 288 

See also Fourier transforms 
shot noise, 520 
shutters, 1051 
sifting property, 152 

of impulse function, 156 
for intervals, 152 
for points, 152 
signal(s) 

1D, 120

2D, 121, 165-169, 407 
aperiodic, 131, 197, 198, 199, 204 
autocorrelation of, 382 
bandlimited, 340-341, 363 
box, 153-154, 203-204, 235-238 
continuous, 128-129 
continuous-time (CT), 128-129, 363 
convolving, 120, 122 
cross-correlation of, 382 
DC component, 210 
decomposition of, 272 
defined, 127-128 
difference, 270 
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signal(s) ( continued) 
discontinuous, 195 

discrete-time (DT), 129-130, 161, 164-165 
dividing, 393 
downsampling, 271 
even, 129 

finite support, 158-159 
finite width of, 341 
flat, 210 

Fourier space representation of, 355 

Fourier transform of, 123 

frequency content of, 246 

frequency-space, 148 

half-flat, 394 

half-ramp, 394 

high-resolution, 269 

homogeneous, 479 

impulse, 148-153, 165 

impulse train, 154 

inhomogeneous, 479 

low-resolution, 269 

multidimensional, 127 

odd, 129 

oversampled, 340 

periodic, 130-132, 199, 201-202, 204 
period of, 130 
product, 120 
quantized, 130 
reconstructing, 336 
resolution of, 244 
sampling, 120, 336, 361 
sinc, 155, 204, 341
smoothed, 268-269 
subsampling, 265 
systems and, 127-169 
time, 213 
types of, 127-132 
uncertainties in, 400 
undersampled, 340 
upsampling, 271 
windowing, 247-249 
See also systems 
signal estimation, 409 
block diagram, 410 
signal processing 

braket notation in, 145 
digital, 169, 174 
for filter design, 299-300 
multidimensional, 169 
nonuniform, 369 


operations, 169 
theory, 299 
trick of, 122 
signal space, 364 

1D box filtering in, 356
signal-to-noise ratio (SNR) 
power, 456
refinement test, 474 
RMS, 474 
See also noise 

simple hierarchical radiosity (SHR), 954
simple index of refraction, 734 
simulations, 543 

sinc function, 155, 204, 341, 355, 384
2D separable, 387 
clipped, 365 
in frequency space, 358 
HR, 359 

single-image random-dot stereogram (SIRD), 
35-37 

single-plane projection method, 931 
advantage of, 931 
illustrated, 932 

singular integral equations, 864, 869 
singularities, 864-868 
benign, 865 

computer graphics and, 864, 868 
defined, 864 
handling methods, 865 
avoidance, 865 
coexistence, 865, 868 
divide and conquer, 865, 868 
factorization, 865, 867 
ignorance, 865 
removal, 865, 866-867 
ill-behaved, 867 
malignant, 865 
weakening, 866, 867 
well-behaved, 867 
size cue, 38-39 

illustrated example, 38 
size pass, 1047 
skip process, 381 
slits 

as cylindrical point source, 547 
distances of, 549 
experiments, 579 
two parallel, 546 
See also double slit experiment 
SML cone response curves, 1169, 1172, 1173 
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smoothed signals, 268 
illustrated, 269 

Snell’s law, 575-576, 578, 1111
equation, 576 
illustrated, 575 
sodium, 689 
solarization, 770 
solar radiation, 764, 766 
solid angles, 598, 599-605, 617 
2D, 599-600 
approximation, 602, 956 
defined, 599 
differential, 651 
hemispherical, 603 
illustrated, 618 
incident, 669 
integrating over, 605-606
intersected objects and, 998 
inverse-square term in, 658 
multiple objects occupying, 605 
notation for, 603-605 
projected, 603, 649 

importance assignment and, 971 
properties of, 603 
ratio of flux and, 653 
reflected, 669 

rendering methods and, 970-971 
resolved strata method and, 999 
shading exitance, 721 
splitting, 621 
steradian, 600 
stratification on, 997 
tracing, 1008-1009 
types of, 669 
to zero size, 617 
source-importance equality, 861 
source-importance identity equation, 863 
source patch, 653 

Southwell + Jacobi iteration, 914, 915 
Southwell iteration, 901 
defined, 904 
illustrated, 906 
performance, 915 
radiosity and, 909 
shooting step, 912 
space-invariant filters, 517 
spaces, 127, 146-148 
chord, 146-147 
combining, 282 
defined, 146 


frequency, 146, 148 
function, 1090-1091 
linear, 1085-1090 
nested, 284 
product, 137-138 
signal, 146, 148 

space subdivision, 1026, 1027-1030 
illustrated, 1029 
spatial resolution, 173 
spectral coefficients. See Fourier series 
coefficients 

spectral efficiency functions, 1170 
spectral locus, 51 
spectral radiometric terms, 649 
defined, 659 
list of, 650 

See also radiometric terms 
spectral radiometry, 659 
spectrogram 

frequency resolution, 289 
illustrated, 289 
for impulse function, 290 
STFT, 289 

for computing, 287 
for sum of three sines, 290 
wavelet, 287 
spectrum, 15 
box, 206-208 

containing infinite grid of replications, 350 
flat, 210 

inverse Fourier transform of, 206, 229 
not within limiting square, 351 
reconstructed, 358 
within limiting square, 351 
specular adjustment factor, 749 
specular gloss, 565, 566 
specular reflection, 563, 564, 573, 744, 
1105-1110 
geometry, 1109 
specular surface, 1049 
specular transmission, 566, 567, 577, 
1105-1108, 1110-1111 
geometry, 1110 
specular vectors 

computing, 572-578 
geometry, 573 
reflected vector, 573-574 
total internal reflection, 574-576 
transmitted vector, 576-578 
sphere equation, 1023 
spheres 

around shading point, 724 
BRDF, 723, 724 
emission, 723, 724 
hemilune of, 673 
illumination, 723-724 
radiance, 723, 724 
spherical sector of, 673 
spherical harmonics, 675-677 
2D, 757 
defined, 675 
illustrated, 677 

precomputed BRDF and, 756-757 
with single index, 676 
for storing local illumination info, 756 
spherical patches, 649-651 
differential patch, 651 
spin-moment quantum number, 683 
spline interpolation, 523 
splitting 

input face, 621 
solid angle, 621 
spontaneous emissions, 681 
spots 

brightness of, 80 
Gaussian analysis, 112 
patterns of, 80 
spacing, 89 
apparent, 97 

See also display spot interaction; phosphors 
square basis, 838 
square deconstruction, 291 
square function, 253 
square-integrable function, 194, 1090 
square lattice, 420-424 
illustrated, 423 
subdivided, 423 
squares 

of isolation, 349, 350 
limiting, 351 

stratification of, 390, 391 
square tiles, 439 

circumscribing circles in, 440 
square wavelet decomposition, 293-296 
basis functions, 293, 295 
defined, 293 
example, 296 
See also wavelets 
standard deviation, 1101 
Gaussian, 208 


standard observer, 45, 49 
standard one-speed particle transport equation, 
628, 630 

standard phosphors, 102 
coordinates for, 102 
standing waves, 709 
star patterns, 1055 
star reconstruction, 483 
central, 511-512 
See also reconstruction 
stationary value, 1107 
finding, 1109 
illustrated, 1108 
steady state, 595, 888 
Stefan-Boltzmann constant, 714 
Stefan-Boltzmann law for blackbody radiation, 
714 

steradian, 600 
stereoblind, 37 
stereolithography, 1078 
stereopsis, 35, 37 

Stevens and Stevens experiments, 1060 
stiffness matrix, 827 
Stiles-Crawford effect, 15 
Stirling’s approximations, 707 
stochastic ray tracing. See distribution ray 
tracing 

stochastic sampling, 373, 375, 411 
defined, 373 
illustrated, 373 
trading aliasing and, 375 
See also nonuniform sampling 
Stokes’ law, 715 
Stokes parameters, 558 
stop band, 221 
strata, 389 
choosing, 391 

in classical ray tracing, 1010-1011 

constructing, 1002 

direct, 1006, 1007 

directional, 1002 

equal-energy portion, 395 

illustrated, 389 

indirect, 1006, 1007 

induced, on surfaces, 992 

on direction hemisphere, 992 

poor choice of, 390 

projection, 992 

resolution of, 997 

resolved, 998, 999-1002 

signal broken into four, 390 
spatial, 1002 
splitting, 392 
visibility-resolved, 999 
See also stratification 
strata sets, 993-999 

direction-driven, 993-996 
directly visible points, 995 
point-driven, 996-999 
types of, 993 
stratification, 990, 1004 
for circular domain, 391 
distributed ray tracing and, 1031 
dynamic, 496, 1037 
on solid angle, 997 
of square, 390, 391 
See also strata 

stratified sampling, 310, 479 
adaptive, 391 
advantage of, 391 
blind, 309-310 
defined, 388 

importance sampling and, 395-398 
informed, 319-320, 329 
method of, 388-389 
Strauss shading model, 747-750 
color shifting, 750 
diffuse adjustment factor, 749 
geometry, 748 

specular adjustment factor, 749 
surface parameters, 748 
stream, particle, 594 
streaming, 621, 622-623 
defined, 622 
explicit flux, 594 
flux due to, 623 
illustration, 622 
Student’s distribution, 305-306 
Student’s ratio, 306 
subdivided patches, 936 
hierarchy of, 942 
subdivisions, 483 

directional, 1026, 1030-1031 
initial sampling and, 487 
isosceles triangular, 485 
centered, 486 
levels, 508 

multidimensional method of, 1031 
right triangular, 486 


space, 1026, 1027-1030 
in Whitted’s method, 507 
subjective constraints, 1067-1069 
subjective contours 
defined, 52 
illustrated, 52 

subjective rendering, 1076-1077 
subsampling, 265 
successive substitution, 805 
defined, 805 
summations, 140, 222 
distinct terms in, 223 
infinite, 140 
reversing, 223-224 
switching order of, 231 
sum tables, 1035 
supersampling, 359-365 
adaptive, 243, 420 
cells, 416 
defined, 359 
impulse train, 362 
methods, 359-360 
model of, 362 
support interval, 130 
supremum (sup), 1087 
surface-based methods, 1001 
surface emission, 592, 593-594 
function, 634 
surface flux, 592 
surface points, 993 
set of, 994 
visible, 995 
surfaces, 888 

of constant amplitude, 550 
of constant phase, 550 
enclosure, 888 
hemisphere above, 608 
illustrated, 618 
paint on, 771 
projection, 925 
radiosity of, 888 
subdividing, 888, 975 
viewing, 881 

surface-scattering distribution function, 634 
surface-scattering function, 634 
surface shading, 757 

volume shading vs., 757-758 
surface texture, 780 
surface-to-surface form factors, 981 
surface-to-volume form factors, 981 





symbolic expression, 111 
symbolic methods, 799, 804-808 
Fubini theorem, 804-805 
Neumann series, 806-808 
successive substitution, 805 
synesthesia, 1080 
synthesis equation, 193, 200, 224 
coefficients in, 226 
definition of, 206 
Fourier series, 198, 202 
See also analysis equation; Fourier transform 
synthetic images 

applications for, 1079 
computed on digital computer, 117 
matching to reality, 1053-1054 
as multidimensional signals, 127 
reasons for creating, 3 
See also image synthesis 
systems 

2D, 165-169 
avalanche, 591 
balanced, 888 
defined, 132 

in equilibrium, 595-596, 888 
as filters, 155 
frequency response, 164 
impulse response, 156 
linear 

2D, 165-166 
time-invariant, 132-135 
LTI, 163-164, 168, 215 
maps of, 132 
self-sustaining, 591 
shift-invariant, 135, 164 
signals and, 127-169 
steady state, 888 
subcritical, 591 
supercritical, 591 
types of, 132-135 
See also signals 

system transfer function. See frequency response 

T 

talbots, 660 

target reconstruction density, 496 
Tchebyshev approximation, 819, 830-831, 869 
Tchebyshev norm, 430, 1087 
Tchebyshev polynomials, 787 
telescoping sequence, 838 


templates 

cumulatively compatible, 492 
multiple-scale, 447, 497 
temporal smoothing, 18 
tensor products, 291 

for basis functions, 293 
sixteen, 292 
test body, 939 
texel, 781 
texel-mapping, 781 
texture 

defined, 780 
displacement, 781 
procedural, 780 
shading and, 780-781 
stored, 780 
surface, 780 
volume, 780 
texture gradient, 40 
example of, 41 
texture map, 780 
texture mapping, 755 
defined, 780 
texturing, 412 
thermal emission, 704 
thermal equilibrium, 708 
blackbody in, 709 
thin convex-convex lens, 1013 
formed by two spheres, 1015 
thin lens, 1013 

approximation, 1014 
formula, 1018 

geometry for imaging by, 1016 
triangles for model, 1017 
viewing environment through, 1022 
See also lenses 
thin-plate splines, 515-517 
disadvantage of, 516 
illustrated, 515 
utility of, 517 
three-dimensions. See 3D 
three-pass method, 1047 
TIGRE, 877-878, 1049 
defined, 877 
geometry, 878 
operator notation, 878 
See also VTIGRE 
tiles, 427 
2D, 437 
defined, 375 

minimum-distance criterion and, 429 
sampling, 437-440 
unit parameterization, 427 
See also patterns 
tiling, 348 

tilt block, IES, 1145-1149 
time-harmonic fields, 549-550 
time-invariant gray radiance equation. See 
TIGRE 

time-invariant systems, 134-135 
See also linear systems 
time of flight, 1106 
time signal, 213 

tone reproduction curve (TRC), 1058 
applied uniformly, 1061 
tone reproduction operator, 1058 
Torrance-Sparrow microfacets, 732 
total angular quantum number, 684 
total internal reflection (TIR), 574-576 
defined, 575 
total payoff, 855 
calculating, 860 
total probability, 1095-1097 
theorem on, 1096 
tracing 

beam, 1008, 1035 
cone, 1009, 1035 
path, 844-848, 1012, 1036 
solid angles, 1008-1009 
See also photon tracing; ray tracing; visibility 
tracing 

transition band, 220 
transition rules, 690 
transmission, 563-567 
defined, 566, 663 
diffuse, 566, 567 
mixed, 566, 567 

specular, 566, 567, 1105-1108, 1110-1111 
geometry of, 577, 1110 
transmitted vectors, 576-578 
transparencies, 71 
transport equation, 442, 582, 842 
basic, 637 

integral form of, 637-643 
light, 643-644 
notation for, 639 
solution to, 582 
transport theory, 543, 581 
transverse waves, 552 


trapezoid approach, 315 
trapezoid rule, 812 
tree-based sampling, 492-497 

adaptive hierarchical integration, 495 
dynamic stratification, 496-497 
hierarchical integration, 495 
sequential uniform sampling, 492-495 
See also sampling 
tree of rays, 990, 1033 
building, 1033 
illustrated, 991 
for pixels, 1071 
See also rays; ray tracing 
t refinement test, 479 
triangles 

convolution with, 515 
for thin lens model, 1017 
triangular lattice, 420 
illustrated, 417, 421 
isosceles, 420 

triangular phosphor geometry, 76 
illustrated, 77 
triangular subdivision 
isosceles, 485 
centered, 486 
right, 486 
triple integral, 597 
triplet, 687 

Tumblin-Rushmeier model, 1058, 1059, 1060 
two dimensions. See 2D 
two-pass algorithm, 1045 
two-spot interaction, 78-79 
two-stage best candidate algorithm, 454 
generalization of, 454 
illustrated, 455 

two-stage refinement strategy, 376 
two-term Henyey-Greenstein (TTHG) phase 
function, 762 

for atmospheric scattering, 768 
for modeling Saturn rings, 768 
parameters for, 768 

U 

ultraviolet catastrophe, 712 
uncertainties, 400 
underrelaxation, 906 
undersampling, 340, 1033 
aliasing and noise and, 398 
undetermined coefficient equations, 811 
undetermined coefficients method, 810-812 
uniform black field, 85 
illustrated, 86 

uniform sampling, 332, 411-415 
1D continuous signal, 336-340 
1D Uniform Sampling Theorem, 340, 342 
2D Uniform Sampling Theorem, 351, 354 
attraction of, 411 
patterns, 415-417 
sampling density and, 370 
sequential, 492-495 
See also nonuniform sampling 
uniform white field, 94-96 
contrast for, 96 
illustrated, 86 
intensity, 96 
min, max for, 96 
point position, 94-95 
unit step function, 149 
upsampling, 271 

V 

vacuum time-invariant gray radiance equation. 

See VTIGRE 
valence, 695 
value function, 856 
vanishing moments, 258, 261 
variables, random, 301, 397-398, 1098-1101 
variable sampling density, 369-371 
variance 

centered, 399 
external, 497 
Gaussian, 208 
internal, 497 
probability, 475 
refinement test, 475-476 
vectors, 820 

antiparallel, 638 
basis, 176, 179, 813, 832 
bounding, 318 
column, 190 
component, 175 
direction, 598-599, 710 
error, 832, 834 
expansion for, 810 
function, 810 
Jones, 558, 559 
projection, 832 
reflected, 573-574 
span of, 179 

transformation rotating about, 823, 824 


transmitted, 576-578 
wave, 550, 553 
wavelet-transformed, 245 
vector space. See linear spaces 
vertical retrace, 97 
vibration ellipse, 556 
view-dependent algorithm, 881 
view-dependent solution, 881 
view-independent algorithm, 881 
view-independent solution, 881 
viewing plane, 881 
virtual image plane, 881 
virtual reality, 1079-1080 
visibility, recursive, 1002 
visibility function, 993, 1000, 1001 
visibility-test function, 890 
visibility tracing, 988-989, 990-1037 
in a vacuum, 990 
beam tracing, 1008-1009 
camera models, 1013-1021 
cone tracing, 1009 
defined, 988 

with different strata, 1003 
distribution tracing, 1011 
illustrated, 989, 1004 
light-object interactions, 990 
mirrored ball in a room and, 1040-1041 
path tracing, 1012 
pulling visibility and, 993 
pushing visibility and, 993 
strata sets, 993-999 
using, 1047, 1049 
in scenes, 991 

See also photon tracing; ray tracing 
visible resolution, 999 
computing, 1002 
See also resolution 
visible-surface function, 641 
illustrated, 642 
vision operator, 1056 
visual angle, 6 
defined, 6 
measurement of, 7 
visual band, 14, 660 
visual phenomena, 23-33 
contrast sensitivity, 23-28 
lightness constancy, 32-33 
lightness contrast, 31-32 
Mach bands, 29-31 
noise, 28 
visual range, 14 

visual system. See human visual system 
volume emission, 621 
flux, 623 

volume inscattering probability function, 625 
volume methods, 981 

volume outscattering probability function, 624 
volume rendering, 1074-1075 
anti-aliasing methods, 1074 
applications, 1074-1075 
importance of, 1075 
mathematics of, 1074 
See also rendering 
volume shading, 757-780 

atmospheric modeling and, 764-769 
defined, 757 

Hanrahan-Krueger multiple-layer model 
and, 778-780 

Kubelka-Munk pigment model and, 770-778 
multiple-layer models, 778 
ocean and, 769 
phase functions and, 758-764 
surface shading vs., 757-758 
See also shading 
volume texture, 780 
volume-to-volume form factors, 981 
volumetric emission, 593 
Voronoi diagram, 315 
illustrated, 316 
VTIGRE, 878-880, 1049 
defined, 879 
geometry, 879 
operator notation, 879 
outgoing (OVTIGRE), 880 
See also TIGRE 

W 

Walsh functions, 170 
Ward shading model, 750-752 
anisotropic form, 751 
BRDF, 751 

chair photos and, 752 
geometry, 750-751 
parameters, 751-752 
warped function, 500, 502 
warping, 499-503 

computer graphics and, 503 
defined, 499 
wavefront, 550 
wavelength, 14 


decoupled, 877 
decoupled energy, 648 
defined, 549 
frequency of, 1021 

index of refraction as function of, 568 
power vs., 15 
wavelet basis, 243 

to HR algorithm, 964 
wavelet coefficients, 263, 265 
computing, 265 
conditions, 277-282 
finding, 270 

inner product computation with, 280 
nonzero, 277 
for real-world signals, 276 
wavelet functions 
basis, 263 
evaluating, 281 
Fourier transforms of, 262 
moment of, 260-261 
shape of, 281 

two-parameter family of, 261 
values at dyadic points, 281 
wavelets, 243 
2D, 291-296 
amplitude of, 265 
analysis property of, 244 
applications of, 297 
bandwidth ratio and, 287 
compression of, 274-276 
function of, 275 
illustrated, 274-275 
methods, 275-276 
in computer graphics, 245 
creating, 255 

Daubechies first-order wavelets, 279 
defined, 244 
development of, 244 
dilation equation and, 253-255, 267 
dilation parameter, 260 
first two generations of, 264 
four-coefficient, 279 
in Fourier domain, 285-291 
Fourier transform of, 255 
frequency adaption, 289 
Haar, 252, 257, 258-262 
2D, 125 

building up, 257 
defined, 258 
matrix, 271 
mother, 260 

space combinations and, 282, 283 
zero-order matching properties, 278 
hat functions and, 258 
higher-order, 255 

high-frequency information and, 837 
introduction to, 297 
level of, 263 
moments and, 260-261 
mother, 244, 258 
multiresolution analysis, 282-285 
multiresolution framework for, 284 
normalization term, 267 
order of, 279, 280 
orthogonal, 260 
orthonormal basis and, 244 
position of, 263 
as projection method, 837-838 
reason for studying, 244-245 
rectangular decomposition, 291-293 
resolution and, 252 
at same scale, 262 
scale and, 252 
from scaling function, 255 
series of, stacked together, 266 
square decomposition, 293-296 
stretched in time, 286 
translation parameter, 260 
See also Haar basis 
wavelet spectrogram, 287 
wavelet transforms, 243-296 
computing, 263, 297, 837 
defined, 243 
function of, 244 
Haar, 263 
inverting, 277 
Parseval relation for, 286 
pattern for, 287 
principles of, 263 
See also wavelets 
wave packet, 563 
waves 

cylindrical point source of, 547 
homogeneous, 550 
inhomogeneous, 550 
interference between, 549 
periodic, 549 
plane, 550, 551 
spherical point source of, 546 


standing, 709 
transverse, 552 
wave theory, 562 
wave vectors, 550 
illustrated, 553 
Weber fraction, 24 
weighted-average filter, 532 
weighted averages, 301 
weighted Monte Carlo, 312-315 
convergence properties, 314 
defined, 312 
estimand, 328 
estimand error, 314 
estimator, 328 
multidimensional, 315-319 
See also Monte Carlo methods 
weighted particles, 847 
weighted process, 381 
weight function, 809, 1089 
weights 

kernel, 862 
quadrature, 809 

white field. See uniform white field 
white noise, 402 
Whitted’s method, 507-509 
subdivision in, 507 
weight assignment and, 509 
windowed filters, 220-221 
window function, 246-247 
windows 

analysis, 247 
box, 247 
Gaussian, 249 

Wyvill and Sharpe’s method, 509-512 
assumption, 512 
illustrated, 510 

X 

XYZ color space, 59-60 
L*u*v* conversion, 61 
recovering, 64 
RGB conversion, 104 
spectra conversion, 104-106 
See also color spaces 

Y 

Yen’s method, 522-532 
Bouville, 526-527 
Cook’s filter, 524-526 
Dippe and Wold, 527 

Max, 527-529 

Mitchell and Netravali, 529-532 
Pavicic, 523-524 
sample patterns, 522 

Z 

Zaremba sequence, 311 

discrepancy due to, 456, 458 
pattern, 459 
z-buffer, 163 
approach, 926 
scan converter, 119 


Zeeman effect, 687 

zero-order hold reconstruction, 344-346 
illustrated, 346 
model of, 346 
zero-order matching, 278 
zero-variance estimation, 320-321 
zonal methods, 981 
zones, 601 

geometry of, 601 



Errata 


• iv / copyrights / Andrew 

replace from “© 1995 by Morgan Kaufmann...” to “...permission of the publisher” with 
“All contents are copyright (c) 2010 by Andrew Glassner. This book is not in the public 
domain, but it has been made available by the author under the terms of the Creative 
Commons Attribution-NonCommercial 3.0 Unported License. Contact information for 
the author is available at www.glassner.com” 

• xxvi / 4th paragraph / Andrew 
Dan Russel should be Dan Russell 

• xxvii / 5th paragraph / Andrew 

"Eric Braun, Chuck Mosher, and Pamela St. John helped me..." 

• Pg. 6 / 2nd paragraph / David Salesin 

"as shown in Figure 1.3 -- this is the center of the crystalline lens." 

• Pg. 23 / 1st paragraph / David Banks 

"...comes from trying to interpret intriguing phenomena..." 

• Pg. 49 / 1st paragraph / David Salesin 

"Thus to match C(\lambda) using standard sources x_s(\lambda), y_s(\lambda), 
z_s(\lambda)" 

• Pg. 49 / Eq. 1.1 / David Salesin 

C(\lambda) = X x_s(\lambda) + Y y_s(\lambda) + Z z_s(\lambda) 

• Pg. 49 / 3rd paragraph / Gary Bishop 

Final sentence should read, "The triangle in Figure 3.32 shows the subset of colors..." 

• Pg. 65 / Equation 2.9 / Ioana Danciu & John C. Hart, and Neil Gatenby 

The denominators 500 and 200 need to be multiplied by L*, and the + sign in the second 
equation should be a minus. 

X = Xn[(Y/Yn)^(1/3) + (a*/(500 L*))]^3 
Z = Zn[(Y/Yn)^(1/3) - (b*/(200 L*))]^3 

• Pg. 65 / Equation 2.9 / Jeremy D. Wendt 

The equations for converting XYZ to LAB use an approximation (Equation 2.5) that isn't 
used when coming back, which can introduce some distortion. Here's a snippet of 
pseudocode by Jeremy to convert LAB to XYZ. pow(a,b) computes a to the b power, and 
cbrt(x) computes the cube root of x (cbrt(x)=pow(x,1/3)). 

LABtoXYZ 
    if (L* == 0) { return (0, 0, 0); } 
    Y/Yn = L* / 903.3; 
    if (Y/Yn < .008856) { 
        Y = Y/Yn * white.Y; 
        X = white.X * ((a*)/(500 * (L*) * 7.787) + Y/Yn); 
        Z = white.Z * (Y/Yn - (b*)/(200 * (L*) * 7.787)); 
    } else { 
        Y = white.Y * pow((((L*)+16.0)/116.0), 3); 
        Y/Yn = cbrt(Y/white.Y); 
        X = white.X * pow((Y/Yn+((a*)/500.0/(L*))), 3); 
        Z = white.Z * pow((Y/Yn-((b*)/200.0/(L*))), 3); 
    } 
    return XYZ 
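
For readers who want something they can run, here is a rough Python transcription of the LABtoXYZ pseudocode. It is only a sketch: the function name lab_to_xyz and the choice to pass the white point as an (Xn, Yn, Zn) tuple are assumptions made for illustration and do not come from the book. The constants are copied from the pseudocode as given.

```python
def lab_to_xyz(L, a, b, white):
    """Transcription of the LABtoXYZ pseudocode.

    L, a, b are the L*, a*, b* values; white is an (Xn, Yn, Zn) tuple.
    The name and the argument layout are assumptions for this sketch.
    """
    Xn, Yn, Zn = white
    if L == 0:
        return (0.0, 0.0, 0.0)

    ratio = L / 903.3  # plays the role of "Y/Yn" in the pseudocode
    if ratio < 0.008856:
        # Branch taken for very dark colors.
        Y = ratio * Yn
        X = Xn * (a / (500.0 * L * 7.787) + ratio)
        Z = Zn * (ratio - b / (200.0 * L * 7.787))
    else:
        # Cube-root branch; "Y/Yn" is reassigned to cbrt(Y / white.Y).
        Y = Yn * ((L + 16.0) / 116.0) ** 3
        root = (Y / Yn) ** (1.0 / 3.0)
        X = Xn * (root + a / 500.0 / L) ** 3
        Z = Zn * (root - b / 200.0 / L) ** 3
    return (X, Y, Z)
```

With a* = b* = 0 and L* = 100 the function should hand back the white point itself, which makes a convenient sanity check.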

• Pg. 76 / line 6 / Shubhabrata Sengupta 

In the phrase "For example, phosphors may be impregnated with pigments that absorb 
light near..." replace "absorb" with "reflect" 

• Pg. 79 / Figure 3.7 / Russell Corfman 

The label for the horizontal axis should replace the word "units" with "multiples" 

• Pg. 82 / Next-to-last paragraph / Russell Corfman 

In the last sentence, replace phrase "white to black pixels" with "white to total pixels" 

• Pg. 82 / Equation 3.9 / Russell Corfman 
Replace denominator with "total number of pixels" 

• Pg. 83 / Figure 3.12 / Russell Corfman 

The phosphor labeled "D_3" should be labeled "B_3" 

• Pg. 85 / Last sentence / Russell Corfman 
Replace "Table 3.5" with "Table 3.6" 

• Pg. 92 / Second paragraph / Russell Corfman 
Replace "Table 3.8" with "Table 3.9" 

• Pg. 102 / 1st paragraph / Gary Bishop 

"Referring to Figure 3.32, the triangle shows the subset..." 

• Pg. 119/ 3rd paragraph / David Banks 

"...we want to show; consider it as a grey-level image...." 

• Pg. 133 / Figure 4.6 / Andrew Kunz 

The + in the lower-left corner should be a multiplication sign. 

• Pg. 139 / 3rd paragraph / Anil Hirani 
Angle z = tan^{-1}[Im(z)/Re(z)] 

• Pg. 141 / Table 4.1, Property E13 / Christian Laforte 
The denominator should be simply "2" 

• Pg. 142 / Equation 4.12, line 3 / Michael Cox (also Mark Bolstad, Darrell Plank, David 
Banks) 

The second exponential doesn't have the e on the baseline: 

= e^(j\omega t) e^(j\omega 2\pi/\omega) 




Pg. 143 / 2nd paragraph / Iliyan Georgiev, David Banks, Wolfgang Stuerzlinger-Protoy 
"...the German word eigen, meaning "innate" or "own"." 

Pg. 144 / 7th paragraph / David Banks 
The utility of the braket notation is that it... 

Pg. 149 / Figure 4.13 / Christian Laforte 

The caption should read "The unit step function shifted right. The graph shows u(t-1)." 

Pg. 149 / Equation 4.25 / Christian Laforte 

The minus sign before the integral should be deleted. 

Pg. 151 / Equation 4.31 / Anil Hirani 

In the right-hand side, the argument to f should be (\tau/a) 

Pg. 156 / Eq. 4.45 / David Banks 

The argument to f inside the integral should be \tau, not t: 

L{f(t)} = \int f(\tau) h(t-\tau) d\tau 

Pg. 156 / Eq. 4.46 / David Banks 

The argument to f inside the integral should be \tau, not t: 
f(t) * h(t) = \int f(\tau) h(t-\tau) d\tau 

Pg. 158/Eq. 4.48/Yang Liu 

The argument in the delta function is incorrect: 

y(t) = f(t) * s(t) = \int f(\tau) \sum_k \delta(t-kT-\tau) d\tau 

Pg. 161 / 3rd paragraph / Shubhabrata Sengupta 

The text should read "representing the footprint of the beam on the face of the tube" 

Pg. 164/Eq. 4.54/Yang Liu 

The exponent is missing a minus sign: 

H(\omega) = \int h(\tau) e^(-\omega \tau) d\tau 

Pg. 164 / Eq. 4.56 / Gladimir Baranoski 
h*k on left-hand side should be h*x 

Pg. 167 / Eq. 4.63 / Yang Liu, Dietmar Dreyer 

The "g" on the right side of the first equation should be an "h", and the arguments to f 
should be \eta and \zeta 

g(x,y) = f*h = \int \int f(\eta, \zeta) h(x-\eta, y-\zeta) d\eta d\zeta 
Pg. 167 / Eq. 4.64 / Yang Liu, Dietmar Dreyer 

The "g" on the right side of the first equation should be an "h". Each "x" should be an 
"m", each "y" should be an "n". The arguments to f should be k1 and k2: 
g[m,n] = f*h = \sum_k1 \sum_k2 f[k1, k2] h[m-k1, n-k2] 

= h*f = \sum_k1 \sum_k2 h[k1, k2] f[m-k1, n-k2] 
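
The double sum in the Eq. 4.64 correction can be tried out directly. The sketch below assumes finite signals stored as lists of rows and treats everything outside them as zero; the function name convolve2d and that zero-padding convention are assumptions for illustration, not something the book specifies.

```python
def convolve2d(f, h):
    """Full 2D discrete convolution of two finite arrays (lists of rows).

    Computes g[m][n] = sum over k1, k2 of f[k1][k2] * h[m-k1][n-k2],
    with samples outside either array taken to be zero.
    """
    f_rows, f_cols = len(f), len(f[0])
    h_rows, h_cols = len(h), len(h[0])
    g = [[0.0] * (f_cols + h_cols - 1) for _ in range(f_rows + h_rows - 1)]
    for k1 in range(f_rows):
        for k2 in range(f_cols):
            for i in range(h_rows):       # i stands for m - k1
                for j in range(h_cols):   # j stands for n - k2
                    g[k1 + i][k2 + j] += f[k1][k2] * h[i][j]
    return g
```

Swapping the arguments gives the same result, which mirrors the f*h = h*f statement in the second line of the equation.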

Pg. 179 / 4th line from bottom / Gladimir Baranoski 
"...then one more more of..." should be "...then one more of..." 



Pg. 170 / Exercise 4.1 / Christian Laforte 
The last two parts should be labeled (f) and (g) 


Pg. 180 / Eq 5.9 / Nicolas Tenoutasse 
The left-hand side should be ||\phi_i||^2 

Pg. 182 / Section 5.2.4 / Darrell Plank (David Banks) 

Something went wrong during the final copyediting of this section, and the discussion on 
duals got confused. This covers the material from the start of the section to the paragraph 
after Equation 5.20. It will need a minor rewrite to fix things up. The rest of the section, 
which discusses Gram-Schmidt orthogonalization, is correct. 

Pg. 182 / Before Eq. 5.18 / Darrell Plank 

"That is, for real functions a_i, and unit-length duals a_k:" 

Pg. 183 / Last line / Darrell Plank 

"...by the following algorithm (with temporary non-unit vectors s_i):" 

Pg. 184 / Eq. 5.23 / Darrell Plank 
second line should begin with s_i, not v_i. 

New third line: 
v_i = s_i / |s_i| 

Pg. 184 / Equation 5.23 / Cass Everitt 
Denominator of last expression should be <v_k|v_k> 
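
The Eq. 5.23 corrections describe Gram-Schmidt orthogonalization with a temporary non-unit vector s_i that is normalized to give v_i. The full equation is not reproduced in this errata, so the sketch below uses the standard textbook form of the procedure on real vectors; the function names and the assumption that the inputs are linearly independent are illustrative choices, not the book's.

```python
import math


def dot(u, v):
    """Inner product <u|v> for real vectors stored as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def gram_schmidt(vectors):
    """Return an orthonormal basis spanning the given vectors.

    Assumes the inputs are linearly independent; otherwise a norm of
    zero would cause a division by zero.
    """
    basis = []
    for u in vectors:
        # s is the temporary non-unit vector: u minus its projections
        # onto the basis vectors found so far.
        s = list(u)
        for v in basis:
            coeff = dot(u, v) / dot(v, v)
            s = [si - coeff * vi for si, vi in zip(s, v)]
        norm = math.sqrt(dot(s, s))
        basis.append([si / norm for si in s])  # v_i = s_i / |s_i|
    return basis
```

Because each v_k is already unit length, the division by <v_k|v_k> is numerically a division by one; it is kept so the code matches the form of the corrected denominator.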

Pg. 185 / Eq. 5.25 / Darrell Plank (Michael Cox) 

"n-m" in exponent should be "m-n" 

Pg. 186 / First line / Darrell Plank (Michael Cox) 

"n-m" in exponent should be "m-n" 

Pg. 186 / Eq. 5.27 / Darrell Plank (Michael Cox) 

All four instances of "n-m" should be "m-n" 

Pg. 186 / Last paragraph / Gladimir Baranoski 
Reference to Figure 5.3 should be Figure 5.6. 

Pg. 186 / Equation 5.33 / Gladimir Baranoski 
\phi_m should be \phi_n 

Pg. 195 / Caption for Fig. 5.7 / Darrell Plank 
"(b) The function |x-2| has no finite discontinuity." 

Pg. 198 / Equation 5.51 / Dietmar Dreyer 

In the first line of this equation, X_c(t) should be x(t) 

Pg. 199 / 4th line from bottom / Gladimir Baranoski 

the word periodic should be aperiodic, reading "...our approximate aperiodic signal..." 



Pg. 201 / Equation 5.59/ Gladimir Baranoski 
In the first two lines, dt should be d\omega. 

Pg. 202 / Equation 5.61, line 4 / Thiago Ize 

Insert a k in the exponent after j, making e^{j k \omega_0 t} 

Pg. 204 / Last paragraph / Gary Bishop 

Delete whole paragraph. Replace with "We can observe a few similarities and differences 
between Figure 5.14 and 5.15. As the period T of the periodic box function increases, the 
samples in the discrete transform begin to pack together more densely. After appropriate 
scaling, the discrete pulses in Figure 5.14 begin to look ever-more like the continuous 
sinc function in Figure 5.15." 

Pg. 214 / Eq. 5.91 / Harrison Ainsworth 
The exponent is missing a minus sign: 

H(\omega) = \int h(\tau) e^(-\omega \tau) d\tau 

Pg. 216 / Equation 5.93, lines 2 and 3 / Ben Luna 

In lines 2 and 3, the exponent j \omega t should be -j \omega t 

In line 3, f(t) should be f(\tau) 

Pg. 216 / Equation 5.94 / Ben Luna 
In the first 3 lines, f(t) should be f(\tau) 

In line 1, the exponent j \omega t should be -j \omega t 
In line 1, move "dt" to the right, just before "d \tau" 

Pg. 217 / Equation 5.97 / Gladimir Baranoski 
X(\omega) should be F(\omega). 

Pg. 218 / last line of text / Gladimir Baranoski 
Reference to Equation 5.83 should be Equation 5.82. 

Pg. 232 / Table 5.3 / Ben Luna 

In the next-to-last line in the Spectrum column, the denominator w\pi should be 2\pi 
Pg. 242 / Exercise 5.6 / Ben Luna 

"Equation 5.23" should be Equation 5.17, and "Equation 5.17" should be Equation 5.23 

Pg. 246 / 3rd line from bottom / Steve Hollasch 
"short-time Fourier transform (or STFT)." 

Pg. 255 / Equation 6.16 / Ju-Wei Huang 

The second line should read 

y_0(4t) + y_0(4t-1) + y_0(4t-2) + y_0(4t-3) 

Pg. 254 / Equation 6.14 / Jim Blinn 

The range of the first clause should not include t=1: 

y_0(t) = 1, 0 <= t < 1 



Pg. 254 / Figure 6.5 / Jim Blinn 

In the leftmost figure of (a), the top and right lines of the square should be dashed 

Pg. 263 / Second paragraph / Jim Blinn 
"...the simple function shown in Figure 6.12(a)." 

"...are shown in Figure 6.12(b)." 

Pg. 263 / After Equation 6.29 / Jim Blinn 
"This is shown graphically in Figure 6.12(c)." 

Pg. 263 / 4th paragraph / Jim Blinn 
Switch the two w terms: 

"...we will sometimes write w^{0,0}(t) as w^0(t)." 

Pg. 266 / Figure 6.13 / Mark Bolstad 

Fifth line. The right-side wave should be inverted; the numbers should be -2 and 2, and it 
should go down on the left and up on the right. 

Pg. 267 / Eq. 6.32, 3rd line / Mark Bolstad 
The left-hand should read H(a,b,c,d) = ... 

Pg. 267 / Eq. 6.32, 4th line / Alade Tokuta 
The right hand side should read = (a, -a, b, -b) 

Pg. 268 / Eq. 6.38 / Ju-Wei Huang 

The 2-by-2 matrix should be 1 row of 2 elements, each with value 1/2. 

Pg. 270 / Equation 6.41 / Jim Blinn 

All references to c0 and c1 should be subscripted as c_0 and c_1 

Pg. 271 / Equation 6.49 / Ju-Wei Huang 
The equation should read 
A: y[k] = (x[k] + x[k+1])/2, k is odd 
(x[k] + x[k-1])/2, k is even 

Pg. 273 / Figure 6.16 / Alade Tokuta 

The wavelet coefficients for [b^{1,0}, b^{1,1}] should be [1, -2] 

Pg. 275 / Figure 6.17g / Russell Corfman 

The first segment should have a value of 0, and the number should be 0 

Pg. 276 / After Eq. 6.55 / Jim Blinn 
24 should be 22 in text: 

"...has an error of only 22." 

Pg. 276 / Eq. 6.56 / Jim Blinn 
Delete repeated 20: 

E_2 = [(*, 34, 18, 20, 16, 22, 18, 22] 




• Pg. 276 / Eq. 6.58 / Mark Bolstad 

The pairs are in the right order, but the elements are reversed: 

{(1,4),(5,-3),(7,3),(4,-2),(8,2),(2,-1),(3,1),(6,1)} 

• Pg. 279 / Eq. 6.62 / Russell Corfman (Jim Blinn) 

The - and + signs for c_1 and c_2 are swapped. 

• Pg. 281/ 4th paragraph / Jim Blinn 
"...from Equation 6.10 for v(1) and v(2)." 

• Pg. 281/ Next-to-last paragraph / Jim Blinn 
"The corresponding eigenvectors v_1 and v_2 are" 

• Pg. 281 / Last paragraph / Jim Blinn 
"...we have found that [v(1), v(2)]=..." 

• Pg. 286 / 3rd line of text / Jim Blinn (and Kenneth Tsui) 

Second phrase should begin with \Phi(\omega): 

"...see that if \Phi(\omega) = P(\omega/2) \Phi(\omega/2), then \Phi(\omega) = 
P(\omega/2)[P(\omega/4) \Phi(\omega/4)], and so..." 

• Pg. 310 / Equation 7.34 / Gladimir Baranoski 
\xi_i should be \xi_{i,j}. 

• Pg. 316 / Figure 7.3 / Paul Heckbert 

The Voronoi diagram on the left isn't exactly right. Each line should be the perpendicular 
bisector of the (undrawn) line that joins the relevant pair of dots. The lines in the figure 
are slightly askew. 

• Pg. 347 / Equation 8.17 / Gladimir Baranoski 
In first line, the second g should be s: 
g[m,n] = f(x,y)s[m,n] 

• Pg. 355 / sixth line / Gladimir Baranoski 

The phrase "equivalent to multiplying g(t) with a box" should be "equivalent to 
convolving g(t) with a box" 

• Pg. 355 / Equation 8.35 / Gladimir Baranoski 

The second set of braces in the second line should be preceded by an F. 

• Pg. 362 / Figure 8.27 / Gladimir Baranoski 

In the leftmost box, the expression p/q should be p/a. 

• Pg. 380 / Figure 9.6 / Gladimir Baranoski 
Line 11 should read "return(\mu)" 

• Pg. 381 / 4th paragraph / Andrew 

"applied to graphics by Dippe and Wold [124], following the original application by Cook 
et al. [102]." 




Pg. 395 / Equation 9.32 / Cuneyt Ozdas 
f[n] should be f[i] 


Pg. 403 / Line after Equation 9.51 / Gladimir Baranoski 
and the support of the power spectral density is given by 

Pg. 423 / Table 10.1 / Cuneyt Ozdas 

The entry under "Total Tiles" for generation 4 should be 256, not 266 

Pg. 425 / 1st paragraph / Cuneyt Ozdas 

Delete "in an L-shaped pattern - two squares in one direction and one square orthogonal 
to it." 

Pg. 426 / Figure 10.18 / Cuneyt Ozdas 

In caption, replace "Regular sampling" with "Regular sampling with jitter" 

Pg. 431 / Figure 10.24 / Stephan Schaefer 
Line 4: d_max should be initialized to 0, not 1 
Line 10: both subscripts should be k, not j 
Line 16: d_min should be d_max 

Pg. 435 / Figure 10.27 / Andrew 

Don't use regular grid on bottom level. Two cones from bottom level disappear. 

Pg. 437 / Fourth line / Gladimir Baranoski 

Larger should be smaller: "Smaller values will cause the algorithm..." 

Pg. 442/ Figure 10.32 / Gladimir Baranoski 
Line 4: 2D_{c,r} should be 2D_{c, r-1} 

Pg. 442 / Caption Fig. 10.33 / David Banks 
Delete the first 'n' in the word starting 'bous...': 

"... algorithm with a boustrophedonic scanning." 

Pg. 474 / Table 10.6 / Gladimir Baranoski 

"Mean distance" in bottom-left box should be "Ray-tree comparison". 

Pg. 476/ Equation 10.26 / Gladimir Baranoski 
The two letters are reversed; it should be "P <= p" 

Pg. 476 / Last paragraph / Gladimir Baranoski 
Second line should read: "...rather than 1-d." 

Pg. 484 / Figure 10.68 / Gladimir Baranoski 
Figure (c) should be white to the right of the curve. 

Figure (e) should have a curve and shade in the upper-left. 

Pg. 510/ Last paragraph / Andrew 

The first sentence of the last paragraph should read: 

To illustrate, suppose that we have a diagonal edge that intercepts the top and right of the 
pixel, at p_t and p_r, respectively. 




Pg. 544 / Fifth paragraph / Mike McCarthy 

In first sentence, replace "indelightradiance" with "radiance" 

Pg. 544 / Sixth paragraph / Mike McCarthy 

Two lines from the bottom of the paragraph, replace "indelightradiance" with "radiance" 
Pg. 570 / Equation 11.35 / Davide Selmo 

In the expressions for S and T, \eta_1 should be \eta_2 (U and V are fine). 

Pg. 575 / last line of the page / Andrew 
The angle in the last line should be theta: 

"... critical angle \theta_c may be found..." 

Pg. 576 / Equation 11.50 / Andrew 

The arguments of both sines should be \theta: 

\eta_i sin \theta_c = \eta_t sin(\pi/2) 
and 

sin \theta_c = \eta_t / \eta_i 
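
The corrected relation above can be checked numerically. The following is a minimal sketch, assuming \eta_i and \eta_t are the indices of refraction on the incident and transmitted sides; the function name `critical_angle` is hypothetical and does not come from the book.

```python
import math

def critical_angle(eta_i, eta_t):
    """Critical angle theta_c (radians) from sin(theta_c) = eta_t / eta_i.

    Hypothetical helper for illustration. Returns None when eta_i <= eta_t,
    since this sketch only treats light passing from a denser medium into
    a rarer one, where total internal reflection can occur.
    """
    if eta_i <= eta_t:
        return None
    return math.asin(eta_t / eta_i)

# Example: glass (assumed index 1.5) into air (assumed index 1.0).
theta_c = critical_angle(1.5, 1.0)
print(math.degrees(theta_c))
```

At the returned angle, \eta_i sin \theta_c equals \eta_t sin(\pi/2), which is the first line of the corrected equation.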

Pg. 576 / Eq. 11.51 / Werner Jainek 
The two \eta arguments should be \theta: 

\eta_i(\lambda) sin \theta_i = \eta_t(\lambda) sin \theta_t 

Pg. 577 / Last line / Francesc Sala 
Exchange sin and cos: 

Now we can see from the construction that |T_\perp| = sin \theta_t and |T_\parallel| = cos 
\theta_t. 

Pg. 584 / 2nd pph, line 8 / Francesc Sala 
Change a(1-\sigma) to 1-a\sigma : 

...the probability that it will escape without collision is 1-a\sigma. 

Pg. 592 / Reflection, line 5 / Francesc Sala 
Change a to x: 

...back into the rod as right-moving flux at x=0... 

Pg. 594 / Eq. 12.20, line 2 / Andrew 
Second term subscript should be e: 

\Phi_s + \Phi_e + \Phi_i - \Phi_a - \Phi_o 

Pg. 595 / Eq. 12.26, line 1 / Francesc Sala 
Second term subscript should be e: 

\Phi_s + \Phi_e + \Phi_i - \Phi_a - \Phi_o 

Pg. 595 / Eq. 12.26, line 2 / Francesc Sala 
First function in square bracket should be e: 

= \Phi(x, R) + \Delta x [e(x, R) + \Phi(x + \Delta x, L) \sigma_s(x, L\rightarrow R)] 



• Pg. 601 / Third paragraph/ Gladimir Baranoski 

The second word of the next-to-last sentence should not be "radius", but instead "area". 
The final expression in the paragraph should not be \pi a^2, but instead \pi \alpha^2. 

• Pg. 602 / Second paragraph/ Gladimir Baranoski 
The parenthetical expression should be "b \ll r" 

• Pg. 602 / Equation 12.33 / Gladimir Baranoski 
Replace all appearances of "d" with "a". 

• Pg. 602 / Line after Equation 12.33 / Gladimir Baranoski 
Replace "where r = \sqrt(S/4\pi)" with "where a = \sqrt(S/4\pi)" 

• Pg. 615 / Third line from bottom / Alex Kulungowski 

Replace "perpendicular to the surface" with "parallel to the surface" 

• Pg. 616 / Figure 12.25 / Alex Kulungowski 

In parts (b) and (c), the labels for "n" and "c" should be swapped 

• Pg. 616 / Paragraph after Equation 12.42 / Chris Faigle 

Second sentence ending "that is, c\cdot n = 0" should read "that is, c\cdot n = 1". 

• Pg. 625 / Figure 12.34 / Atin Atul Kothari 

Put primes on incoming directions (left sides of a and b), take them off the outgoing 
directions (right sides of a and b) 

• Pg. 631 / Eqs. 12.70, 12.71 and 12.72 / Alex Kulungowski 
Each of the curly "R"s should be "R^3" 

• Pg. 638 / Figure 12.38 / Atin Atul Kothari 

The leftmost label on the bottom row should read r-2\omega 

• Pg. 642 / Equation 12.99 / Alex Kulungowski 

In the next to last line, in the right-hand side for $\mu(r,s)$, the function name "\sigma_c" 
should have a hat over it 

• Pg. 643 / Fourth line / Gladimir Baranoski 
Replace "flux \Gamma" with "flux \Phi" 

• Pg. 650 / First paragraph / David Banks 

Replace "subtends an arc of length dtheta" with "subtends an arc dtheta". 

Replace "subtends an arc of length dpsi" with "subtends an arc dpsi". 

• Pg. 650 / Table 13.1 / Chris Faigle 

In leftmost column of lines 3-6, replace each U with a Q 

• Pg. 650 / Table 13.1 / Gladimir Baranoski 

In 2nd and 3rd lines from the bottom, replace "d\Phi" with "d^2\Phi". 




Pg. 650 / Table 13.1 / Torsten Techmann 

In lines 10, 13, and 16 the definitions should have a denominator of "d_\lambda" rather 
than "dA A \Phi". 

Pg. 651/ Second paragraph / Chris Faigle 

Replace "(which is parallel to the N vector)" with "(which is perpendicular to the N 
vector)" 

Pg. 654 / Figure 13.2 / Chris Faigle 

The angle from the bottom disk in the leftmost image should be labeled "\theta_R" 

Pg. 655 / Equation 13.16 / Chris Faigle 
The final superscript "s" should be a subscript. 

Pg. 666 / Figure 13.8 / Ugo Erra 

The projection "dw_iN" should be "dw_i^N". 

Pg. 672 / Equation 13.58 / Ugo Erra 
The term \Omega_i should be \Omega_o: 

1 = \rho(\omega_i -> \Omega_o) = ... 

Pg. 674 / Equation 13.63 / Gladimir Baranoski 
The first term on the right-hand side should read 
f_{r,\theta}(\cos^{-1} u_r \rightarrow \cos^{-1} u) 

Pg. 675 / Eq. 13.71 / Peter Shirley 

Place parentheses around each instance of cos \theta: 

Y(\theta,\phi) = 
    N_{l,m} P_{l,m}(cos \theta) cos(m\phi)      if m > 0 
    N_{l,0} P_{l,0}(cos \theta) / \sqrt{2}      if m = 0 
    N_{l,m} P_{l,-m}(cos \theta) sin(-m\phi)    if m < 0 

Pg. 677 / Figure 13.10 / Bob Lewis, Jonathan Blocksom, Francois Sillion 
These plots are incomplete; they're missing some lobes. For a more complete picture of 
these functions in a computer graphics book, see Figure 7.13 (page 173) in "Radiosity & 
Global Illumination" by Francois Sillion & Claude Puech (reference [409]) or Figure 
10.34 (page 314) in "Radiosity and Realistic Image Synthesis" by Michael Cohen and 
John Wallace (reference [99]). 

Pg. 684 / Last line/ David Banks 

Replace "1s electrons?" with "electrons with the same quantum numbers?" 

Pg. 685 / End of 2nd paragraph / David Banks 

Replace "is written 1s." with "is written 1s (the superscript is often omitted if its value is 1; a 
missing superscript is always taken to be 1)." 

Pg. 693 / 1st paragraph / Andrew 

"each term as a logarithm, in terms of f_1 and f_2, the number of electrons in each level:" 

Pg. 694 /5th paragraph / Andrew 
"...experiments, we find that L corresponds to..." 





• Pg. 699 / Equation 14.20 / David Banks 

In the second line, replace the + sign with a - sign. 

• Pg. 722 / 3rd line from bottom / Atin Atul Kothari 
eliminate bogus word singleindeRGBRGB: 

...description to a single RGB color... 

• Pg. 726 / Equation 15.3 / Caroline Dahllof 

The first S_i should be -S_i, giving R_i = -S_i + 2(S_i \cdot N)N 
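
As a quick check of the corrected formula, here is a minimal sketch. It assumes S points away from the surface toward the source and N is a unit-length normal; the helper names `dot` and `reflect` are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))

def reflect(s, n):
    """Mirror direction R = -S + 2(S . N)N.

    Assumes n is unit length and s points away from the surface
    (toward the source); both are illustrative assumptions.
    """
    k = 2.0 * dot(s, n)
    return tuple(-si + k * ni for si, ni in zip(s, n))

# S arrives at an angle to the normal N = (0, 1, 0).
r = reflect((0.6, 0.8, 0.0), (0.0, 1.0, 0.0))
print(r)
```

With the sign error (R = S + 2(S \cdot N)N) the result would not be the mirror image of S about N, and for a unit-length S it would not even be unit length. With the corrected sign, R makes the same angle with N as S does.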

• Pg. 728 / 1st paragraph in section 15.2.1 / Mike McCarthy 

In third line, insert word "be" after "model for a material that may". 

• Pg. 730 / 1st Paragraph / David Banks 
"topology" should be "topography" 

• Pg. 735 / Figure 15.8 / David Salesin 

Expanded scale on right side of graph is not marked. 

• Pg. 736 / Figure 15.9 / David Salesin 

Expanded scale on right side of graph is not marked. 

• Pg. 736 / 1st paragraph / Steve Worley 

The first word on the page should be absorption, not reflection. 

• Pg. 736 / beginning of 2nd paragraph / Steve Worley 

Insert the sentence “F_r is the same as our earlier F, and F_t=(1.0-F_r).” 

• Pg. 737 / Equation 15.13 / Chung-Fa Chang 

The + sign in the denominator within the braces should be a - sign. 

• Pg. 749 / Paragraph after Equation 15.30 / Gladimir Baranoski 
Replace "adjustment factor d_m" with "adjustment factor d_a" 

• Pg. 749 / Equation 15.32 / Gladimir Baranoski 
Should read "b=F(\theta)G(\theta)G(\delta)" 

• Pg. 750 / Equation 15.35 / Gladimir Baranoski 
Left-hand side should be s_c 

• Pg. 751/ Equation 15.36 / Gladimir Baranoski 

The 2 in the denominator of the last term should be a 4 

• Pg. 751/ 2nd paragraph after Equation 15.36 / Gladimir Baranoski 
Replace "as long as a < 0.2" with "as long as \sigma < 0.2" 

• Pg. 751 / Equation 15.37 / Gladimir Baranoski 

The 2 in the denominator of the last term should be a 4 

• Pg. 752 / First paragraph / Gladimir Baranoski 

Replace "and \alpha_x and \alpha_y represent" with "and \sigma_x and \sigma_y" 




Pg. 752 / Third paragraph in section 15.6.3 / Mike McCarthy 

The reference to [187] should be to [186a]. (See erratum for pg. B-15) 

Pg. 762 / Equation 15.43 / Andrew 

The formula should be: P_S(r, g_1, g_2, a) = r \frac{1 - g_1^2}{(1 - g_1 a)^2} + (1 - r) \frac{1 - g_2^2}{(1 - g_2 a)^2} 
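
The corrected expression is a blend of two lobes of the same form, weighted by r and 1 - r. The sketch below simply evaluates it as written; the function name `two_lobe_phase` is hypothetical, and the meaning of the argument a is not stated in this erratum, so it is passed through unchanged.

```python
def two_lobe_phase(r, g1, g2, a):
    """Evaluate r*(1-g1^2)/(1-g1*a)^2 + (1-r)*(1-g2^2)/(1-g2*a)^2.

    Hypothetical helper: a direct transcription of the corrected
    formula, with no normalization added.
    """
    lobe1 = (1.0 - g1 * g1) / (1.0 - g1 * a) ** 2
    lobe2 = (1.0 - g2 * g2) / (1.0 - g2 * a) ** 2
    return r * lobe1 + (1.0 - r) * lobe2

# With g1 = 0 and r = 1 the expression reduces to the constant 1.
print(two_lobe_phase(1.0, 0.0, 0.9, 0.3))
```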

Pg. 778 / Equation 15.87 / Stephen Diverdi 

In the argument to coth, the \sigma_s should be \sigma_a 

Pg. 778 / Equation 15.88 / Stephen Diverdi 

In line 1, the factor \sigma_a should be deleted from the denominator 
In line 1, the function coth should be cosh 
In line 2, \sigma_s should be \sigma_a 

Pg. 780 / Second Paragraph / Mike McCarthy 

The sentence should end, "a surface texture can be evaluated only at points on surfaces." 

Pg. 792 / Second paragraph / Gladimir Baranoski 
Replace "inderadianceUnit" with "Unit" 

Pg. 794 / Table 16.1 / Chris Faigle 

The upper limit of the three integrals in the right-hand column should be "t" rather than 
"s" 

Pg. 811/ Equation 16.58 / Gladimir Baranoski 
In last line, m_2 should be m_p 

Pg. 812 / Equation 16.61 / Gladimir Baranoski 
Replace ds in left-hand side with dt 

Pg. 826 / Second paragraph / Chris Faigle 

Replace "at the p points u_i." with "at the p points t_i." 

Pg. 830 / Equations 16.97 - 16.102 / Chris Faigle 

Replace each "a<=s<=b" under the max with "a<=t<=b" in these equations (there are six 
replacements) 

Pg. 831 / Equation 16.103 / Chris Faigle 
The "ds" on the right-hand side should be "dt" 

The entire right-hand side should be under a square-root sign (or a pair of brackets with 
the whole expression raised to 1/2). 

Pg. 834/ Equation 16.113 / Gladimir Baranoski 
Left-hand side should read <g|h_2> 

Pg. 836/ Paragraph after Equation 16.123 / Gladimir Baranoski 
Should read "so x_n \rightarrow x as n \rightarrow \infty [343]." 

Pg. 840/ Second paragraph / Gladimir Baranoski 

Parenthetical phrase should read "(proportionally to 1/\sqrt(n) for n samples)." 




• Pg. 847/ Last paragraph / Gladimir Baranoski 

Some of the weights are mis-numbered. In the sentence starting with "Suppose that we 
have...", it should end with "w_4 = (N\rho)(\rho^2) = N\rho^3." The next sentence should 
end with "w_5 = N^2\rho^4." The next sentence should end with "w_6 = w_5." 

• Pg. 848/ Figure 16.17 / Gladimir Baranoski 

The bottom label should change from "w_5 = N^2\rho^5" to "w_5 = N^2\rho^4." 

• Pg. 856/ Equation 16.162 / Gladimir Baranoski 
The right-hand side term S_x should be S_k. 

• Pg. 857/ Paragraph after Equation 16.164 / Gladimir Baranoski 

Replace "by virtue of being in t, times the remaining" with "by virtue of being in t, plus 
the remaining" 

• Pg. 866/ Second paragraph / Gladimir Baranoski 
Replace "K_0(t,q)" with "k_0(t,q)" 

• Pg. 868 / 6th line from bottom / Atin Atul Kothari 
fix dsicussion to discussion: 

...is Golberg's discussion in [162]. 

• Pg. 869/ Exercise 16.2 / Gladimir Baranoski 
This exercise is garbled; ignore it. 

• Pg. 869/ Equation 16.203 / Gladimir Baranoski 
Replace "cos(n cos^{-1} s)" with "cos(n cos^{-1} t)" 
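
The expression cos(n cos^{-1} t) is the standard trigonometric form of the Chebyshev polynomial of the first kind, T_n(t), on [-1, 1]. The surrounding equation is not reproduced in this erratum, so the sketch below only illustrates that identity, comparing the trigonometric form with the usual three-term recurrence. Both function names are hypothetical.

```python
import math

def chebyshev_trig(n, t):
    """T_n(t) = cos(n * arccos(t)), valid for -1 <= t <= 1."""
    return math.cos(n * math.acos(t))

def chebyshev_recurrence(n, t):
    """T_n(t) via T_0 = 1, T_1 = t, T_{k+1} = 2 t T_k - T_{k-1}."""
    prev, curr = 1.0, t
    if n == 0:
        return prev
    for _ in range(n - 1):
        prev, curr = curr, 2.0 * t * curr - prev
    return curr

# The two forms agree to rounding error on [-1, 1].
print(chebyshev_trig(3, 0.5), chebyshev_recurrence(3, 0.5))
```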

• Pg. 869/ Equation 16.204 / Gladimir Baranoski 
Replace the s in lines 1 and 3 with t 

• Pg. 880 / Eq. 17.16 / Atin Atul Kothari 

First argument to L^e after equal sign should be r: 

L(r, w^o) = L^e(r, w^o) + ... 

• Pg. 890 / Eq. 18.1 / Atin Atul Kothari 

First argument to L^e after equal sign should be r: 

L(r, w^o) = L^e(r, w^o) + ... 

• Pg. 890/ Paragraph before Equation 18.2 / Gladimir Baranoski 
Replace "\nu(r,s)" with "\nu(r,\vec(\omega))" 

• Pg. 891 / Eq. 18.4 / Atin Atul Kothari 

First argument to L^e after equal sign should be r, in all three references: 

L(r, w^o) = L^e(r, w^o) + ... 

• Pg. 892 / 5th line from top / Atin Atul Kothari 
eliminate bogus word approximaindeione: 

An approximate solution L can be... 




Pg. 895 (Color section) / Figure 19.52 / Gustavo A. Patow 

The figures are shown in the wrong order. To match the captions, rearrange the images in 
the order (b,c,f,d,e,a). 

Pg. 896 / Equation 18.29 / Janne Koponen 

The F term in the second column of the second row should have the subscript 2,2. 

Pg. 898 / Figure 18.3 caption / Kim Dong Ho 
(c) Wall B radiating, A and C reflecting. 

Pg. 902 /1st line, Eq. 18.40 / Kim Dong Ho 
Lose K_{i,i} at start of first line 

Pg. 903 / Figure 18.5 / Chris Faigle 
Lines 1 and 6 should read "for i <- 1 to n". 

Pg. 904 / Figure 18.6 / Chris Faigle 
Lines 1 and 5 should read "for i <- 1 to n". 

Pg. 905/ Equation 18.43 / Gladimir Baranoski 
Replace + sign in lines 3 and 4 with - sign. 

Pg. 906 / Figure 18.7 / Chris Faigle 
Line 1: The 0 should be a 1 
Line 9: The 0 should be a 1 

Line 10: Replace K_j,i in the numerator of the rightmost term with K_k,i 
Pg. 912 / Figure 18.11 / Andrew 

The arrow rising to the top face should pass behind the near edge of that face, not in front 
of it. 

Pg. 913 / Equation 18.55 / Chris Faigle 

Delete the \Delta in front of the \rho in the numerator 

Pg. 914/ Equation 18.57 / Gladimir Baranoski 

Replace right-hand term "B_k A_k" with "\Delta B_k A_k" 

Pg. 917/ Paragraph after Equation 18.59 / Gladimir Baranoski 
Replace "Now if A_i" with "Now if dA_i" 

Pg. 922 / Figure 18.18 / Atin Atul Kothari 
Label q_l should be q_i 

Pg. 943 / Figure 18.35 / Janne Koponen 

In the lower half of part (d), the entry below and to the right of the root node "A" should 
be "A_3" 

Pg. 946 / Figure 18.37 / Dani Lischinski 
Label A just below A_1 in part (b) should be A_2. 




• Pg. 948 / 5th paragraph / Dani Lischinski 

Replace "number of input polygons" with "number of elements" 

• Pg. 948/ 6th paragraph / Gladimir Baranoski 

Replace "C-level patches are descendents" with "C-level patches are parents" 

• Pg. 953 / Next-to-last line / Dani Lischinski 
Subscript on Y_q should be g 

• Pg. 954 / Figure 18.44 / Dani Lischinski 
"struct Link }" should be "struct Link {" 

• Pg. 955 / Figure 18.46 / Dani Lischinski 
"InitBg" should be "InitBs" 

• Pg. 958 / Figure 18.51 / Dani Lischinski and Stephan Schaefer 
7th line should read 

"PushPullRad(r, 0)" 

• Pg. 964 / Figure 18.56 / Dani Lischinski 
"InitBg" should be "InitBs" 

• Pg. 966 / next-to-last line / Dani Lischinski 
First + should be = 

"...form factor matrix (tilde-K) = K + DK. Then..." 

• Pg. 967 / end of 1st paragraph / Dani Lischinski 
Final term K(tilde-B) should be in bold. 

• Pg. 967 / just before Eq. 18.88 / Dani Lischinski 
Reference [413] should be [414] 

• Pg. 969/ Equation 18.94 / Gladimir Baranoski 

The term "i\in I" should appear below the sigma in both lines. 

• Pg. 969/ Equation 18.95 / Gladimir Baranoski 

The first term in the right-hand side of the first line should change from \frac{ 1} {A_i} to 
\frac{ 1 }{A_I}. 

• Pg. 970 / Figure 18.62 / Dani Lischinski 
"InitBg" should be "InitBs" 

• Pg. 970 / Figure 18.62 / Dani Lischinski 
Line 4, Gamma_s should be Y_s 

• Pg. 970 / Figure 18.62 / Dani Lischinski 
"RefineLink(L)" should be "RefineImpLink(L)" 

• Pg. 970 / Last paragraph / Dani Lischinski 
"SolveImpRad" should be "SolveImpHR" 




• Pg. 971 / 2nd paragraph / Dani Lischinski 
"SolveImpRad" should be "SolveImpHR" 

• Pg. 972 / Figure 18.64 / Dani Lischinski 
5th line, L.Y_g should be L.q.Y_s 

• Pg. 972 / Figure 18.65 / Dani Lischinski 
Insert a line before the first endfor: 

RecursivelySetYto0(r) 

• Pg. 973 / Figure 18.66 / Dani Lischinski 
Fourth cluster of lines: 

Each instance of L.p.Y should be L.p.Y_g 
the term n. Y should be n. Y_s 

• Pg. 973/ Equation 18.66 / Gladimir Baranoski 
Remove endwhile in next-to-last line 

• Pg. 977/ Equation 18.99 / Gladimir Baranoski 
The term \Phi_i^r should be \Phi_i^k 

• Pg. 990 / Equation 19.1 / Paul Lalonde 

The argument s in L^e(s, w^o) should be an r. 

• Pg. 990 / Equation 19.3 / Paul Lalonde 

On both lines, the argument s in L^e(s, w^o) should be an r. 

• Pg. 997 / Figure 19.8 / Andrew 

There should be two lines on the rightmost (gray-shaded) partial ellipse, showing the 
edges of the intervening partial ellipses (like the figure on the hemisphere). On the bottom 
corner should be shade (as on the hemisphere). 

• Pg. 1017 / Figure 19.27 / Andrew 

Remove dot at top of line coming out of point F 

• Pg. 1021 / 4th paragraph / Andrew 
Add references: 

"Distribution ray tracing [101] and path tracing [234] are the most..." 

• Pg. 1024 / Eq. 19.28 / Seth Teller & class 

The expression for d should be the discriminant, not its square root: 
s_1 = (-b + \sqrt d)/2a 
s_2 = (-b - \sqrt d)/2a 
where 

d = b^2 - 4ac 
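
The corrected convention (d is the discriminant, and the square root is taken when forming the roots) can be sketched as follows. The function name `quadratic_roots` is hypothetical, and the sketch assumes real coefficients with a nonzero.

```python
import math

def quadratic_roots(a, b, c):
    """Real roots of a*s^2 + b*s + c = 0, or None if there are none.

    d is the discriminant b^2 - 4ac; the roots use sqrt(d).
    Assumes a != 0 (illustrative helper, not the book's code).
    """
    d = b * b - 4.0 * a * c
    if d < 0.0:
        return None  # no real root
    root = math.sqrt(d)
    s1 = (-b + root) / (2.0 * a)
    s2 = (-b - root) / (2.0 * a)
    return s1, s2

print(quadratic_roots(1.0, -3.0, 2.0))
```

Testing the sign of d before taking the square root is what makes the discriminant form convenient: a negative d signals that there is no real solution.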

• Pg. 1031 / Equation 19.29 / Gladimir Baranoski 
Replace all three occurrences of r with s. 



Pg. 1031 / Equation 19.30 / Gladimir Baranoski 
Replace all four occurrences of r with s. 

Pg. 1048 / Figure 19.50 / Gladimir Baranoski 

Remove diffuse patch in the arc starting at the light source. 

Pg. 1048 / Figure 19.50 / Andrew 

Adapted from Chen, Rushmeier, Miller, and Turner, "A progressive multi-pass method 
for global illumination" (Proc. Siggraph 91), Figure 1, pg. 166 

Pg. 1049 / Figure 19.51 / Andrew 

Adapted from Chen, Rushmeier, Miller, and Turner, "A progressive multi-pass method 
for global illumination" (Proc. Siggraph 91), Figure 1, pg. 166 

Pg. 1087 / Equation A.3 / Chris Faigle 

In the third line, move "a<=t<=b" under the "max" 

Pg. 1090 / First paragraph under Eq. A. 14 / Janne Koponen 
"finite" should be "countable" 

Pg. 1124/ Equation FF4 / Nail Gatenby 

There are two terms that take the inverse-tangent (marked tan^{-1}) of fractions. Both of 
these should have a square root in the denominator. Thus these terms should be 
tan^{-1}(X/sqrt(1+Y^2)) and tan^{-1}(Y/sqrt(1+X^2)). 

Pg. 1180/ Table G.10 / Davide Gadia 

The Y value for the Neutral 3.5 patch should be 9.00 

B-15/[187]/Andrew 

Missing citation: [186a] Pat Hanrahan and Jim Lawson, "A Language for Shading and 
Lighting Calculations", Computer Graphics (Proc. Siggraph '90), 24(4), 289-298, August 
1990. 

B-28 / [373] / Andrew 
"Hanen" should be "Hanan" 

B-28 / [374] / Andrew 
"Hanen" should be "Hanan" 

B-37 / Ref. 499 / Anil Hirani 
Remove "and H. Webb" 

I-18 / G index / Chris Faigle 

Under "Galerkin method" and subhead "classical radiosity and", the indexed pages should 
be 892-893 




When you want to know a thing you have studied in your memory proceed in this 
way: When you have drawn the same thing so many times that you think you know 
it by heart, test it by drawing it without the model; but have the model traced on 
flat thin glass and lay this on the drawing you have made without the model, and 
note carefully where the tracing does not coincide with your drawing, and where 
you find you have gone wrong; and bear in mind not to repeat the same mistakes. 
Then return to the model, and draw the part in which you were wrong again and 
again till you have it well in your mind. 

Leonardo da Vinci 


... there ain’t nothing more to write about, and I am rotten glad of it, because if I’d 
a knowed what a trouble it was to make a book I wouldn’t a tackled it and ain’t 
agoing to no more. 


Mark Twain 

(“The Adventures of Huckleberry Finn,” 1884) 







The theory of digital image synthesis, or rendering, is composed of three core topics: the human visual 
system, digital signal processing, and the interaction of matter and energy. This text provides a solid 
foundation in each of these fields, utilizing a consistent terminology and notation throughout. Two 
modern approaches to rendering, hierarchical radiosity and distribution ray tracing, are discussed, 
with a focus on how they interweave these core disciplines to create efficient and accurate algorithms. 
Researchers and implementors who seek an increased understanding of today’s rendering techniques 
and a guide to the development of tomorrow’s will benefit from the clear presentation offered here. 

Clear, accessible, and comprehensive, Principles of Digital Image Synthesis is an important resource 
for computer graphics programmers and researchers alike. It is also a uniquely valuable textbook 
for intermediate and advanced computer graphics courses. 


Andrew Glassner’s contributions to computer graphics span over 15 years. His work at Microsoft 
Research, Xerox PARC, the IBM Watson Research Labs, Bell Communications Research, and the 
Delft University of Technology has produced numerous technical articles on rendering theory and 
practice, animation, modeling, and new media. Dr. Glassner also authored Computer Graphics: 
A Handbook for Artists and Designers, edited An Introduction to Ray Tracing, and created the 
Graphics Gems series for programmers. A popular speaker, he often addresses both technical and 
general audiences on topics ranging from computer graphics and art to the ethics and politics of 
computers in society. Dr. Glassner is a member of the editorial boards for ACM Transactions on 
Graphics and IEEE Computer Graphics & Applications. He has served on papers committees for 
both the Eurographics and SIGGRAPH conferences, and he chaired the Technical Papers Committee 
for SIGGRAPH '94. He also writes fiction, plays jazz piano, and enjoys painting and hiking. 

He holds a Ph.D. from the University of North Carolina at Chapel Hill. Currently, Dr. Glassner 
is creating new computer graphics at Microsoft Research. 

Radiosity and Global Illumination, Francois Sillion and Claude Puech 

Image synthesis transforms geometry and physics into meaningful images. Because rendering algorithms 
change frequently, it is increasingly important for researchers and implementors to understand the basic 
principles of image synthesis. Andrew Glassner focuses on these principles, providing a comprehensive 
explanation of the three core fields of study that constitute digital image synthesis: the human visual 
system, digital signal processing, and the interaction of matter and light. Assuming no more than a 
basic background in calculus, Glassner demonstrates how these disciplines are elegantly orchestrated 
into modern rendering techniques such as radiosity and ray tracing. 


The Human Visual System: When we look at a computer-generated image, we don’t perceive the underlying 
mathematical representation but the final result of layers of processing, first by the display 
hardware itself and then by our eyes and brain, both of which transform the displayed image. We 
must understand these transformations so that the message contained in the image is not corrupted 
by the process. 

Digital Signal Processing: Digital signals, the numerical representations of the real world, are at the 
heart of image synthesis. Understanding their nature is critical to any rendering algorithm: to creating 
accurate simulations of real-world scenes and avoiding offensive image artifacts. 


Matter and Energy: To simulate the natural world, we must understand its laws, considering the nature 
of light, the physical structure of materials, and the interaction of the two. This study culminates in the 
radiance equation, the essence of all digital rendering algorithms. 

Applications: Hierarchical radiosity and distribution ray tracing are derived from the general theory 
and presented in detail. The discussion shows how these sophisticated rendering techniques are 
approximations of the full theory, delineates how many of today’s most powerful techniques are variants or 
combinations of these basic approaches, and supplies a context for evaluating future algorithms. 

Appendices: Seven appendices provide reviews of linear algebra and probability, a historical look 
at light reflection and refraction, and a wealth of real-world data. 